[R-sig-ME] Interpretation of lmer output in R

Douglas Bates bates at stat.wisc.edu
Mon Feb 21 20:48:23 CET 2011


Actually it only went to me because you sent the original to
r-sig-mixed-models-ownder at r-project.org, not to the list itself.

On Mon, Feb 21, 2011 at 1:44 PM, Julia Sommerfeld
<Julia.Sommerfeld at utas.edu.au> wrote:
>
> not sure if my email came through...
>
> -------
>
>
> Dear Douglas and list member,
>
> Thank you heaps for your answers. The interpretation of the summary output
> (lmer) is becoming much clearer now. I have to admit I had a slightly (not
> to say HUGE) different idea of the summary output.
>
> But a few questions still remain...
>
> I have tried out the suggestions with the following results:
>
> 1. I tested if "Sex" is an important factor in the model:
>
> fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
> fm1<-lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
> anova(fm1,fm)
>
> Data:
> Models:
> fm1: SameSite ~ BreedSuc1 + (1 | Bird)
> fm: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
>     Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
> fm1  3 75.518 81.485 -34.759
> fm   4 77.379 85.335 -34.690 0.1387      1     0.7096
>
>
> 2. Since Sex is not "important",  I compared fm1 with fm2 to test if
> BreedSuc1 is an important factor:
>
> fm2<-lmer(SameSite ~ 1 + (1|Bird), family="binomial")
> anova(fm2,fm1)
>
> Data:
> Models:
> fm2: SameSite ~ 1 + (1 | Bird)
> fm1: SameSite ~ BreedSuc1 + (1 | Bird)
>         Df    AIC        BIC      logLik      Chisq     Chi Df
>  Pr(>Chisq)
> fm2  2     77.617     81.595     -36.808
> fm1  3     75.518      81.485     -34.759     4.0991          1
>  0.04291 *
>
>
> BUT: Ben (thanks for the input) disagreed here:
> "I'm afraid I have to disagree with Doug here. This kind of model reduction,
> while seemingly sensible (and not as abusive as large-scale, automated
> stepwise regression), is a mild form of data snooping. Don't do it ..."
>
> If I don't do it, what could be an alternative? Simply rely on the output of
> fm1 without looking at fm2?
>
>
> 3.  From the output above, I conclude that BreedSuc1 has an effect on
> SameSite:
>
> summary(fm1)
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: SameSite ~ BreedSuc1 + (1 | Bird)
>    AIC       BIC         logLik     deviance
>  75.52     81.48     -34.76    69.52
> Random effects:
>  Groups Name            Variance     Std.Dev.
>  Bird   (Intercept)         0.12332      0.35117
> Number of obs: 54, groups: Bird, 46
>
> Fixed effects:
>                     Estimate     Std. Error     z value     Pr(>|z|)
> (Intercept)      -0.2135       0.3794          -0.563        0.5736
> BreedSuc11    1.1831         0.5921           1.998        0.0457 *
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>            (Intr)
> BreedSuc11 -0.638
>
>
> Now, again the interpretation of the Fixed effects:
>
> BreedSuc1 has a significant effect on SameSite.
>
> #site-fidelity (SameSite=0)  of "bird" (bird(!), since I droppped "Sex"?)
> who was unsuccessful in breeding in the previous season (BreedSuc1=0),
> corresponds to a probability of 44%:
> plogis(-0.2135)
> [1] 0.4468268
>
>
> Further, site-fidelity (SameSite=0)  of "bird" who was successful in
> breeding in the previous season (BreedSuc1=1), corresponds to a probability
> of 72%:
> plogis(-0.2135 + 1.1831)
> [1] 0.7250398
>
>
> Is my interpretation correct? So this output is always based on SameSite=0?
> (SameSite=0 means that birds did NOT change nest site, i.e. they show high
> site-fidelity). But what if SameSite=1 would mean high site-fidelity (no
> change of nests?)?
>
> 4. if I don't drop the term "Sex":
>
> summary(fm)
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
>    AIC   BIC logLik deviance
>  77.38 85.34 -34.69    69.38
> Random effects:
>  Groups Name        Variance     Std.Dev.
>  Bird   (Intercept)     0.14080       0.37524
> Number of obs: 54, groups: Bird, 46
>
> Fixed effects:
>                     Estimate         Std. Error     z value     Pr(>|z|)
> (Intercept)     -0.3294         0.4890         -0.674     0.5006
> BreedSuc11   1.1988          0.5957        2.012      0.0442 *
> Sex    M             0.2215           0.5877           0.377      0.7062
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>            (Intr)             BrdS11
> BreedSuc11     -0.536
> SexM           -0.628      0.065
>
>
> site fidelity for a female bird (i.e. SexM is 0) who was unsuccessful in
> breeding the previous season (i.e.
> BreedSuc1 is 0) is -0.3294, corresponding to a probability of about 42%
>> plogis(-0.3294)
> [1] 0.4183866
>
> The log-odds of site fidelity for a female bird who was successful in
> breeding is -0.3294 + 1.1988, corresponding to a probability of about 70%
>> plogis(-0.3294 + 1.1988)
> [1] 0.7046208
>
>
> 5. The z-value: Sorry, I still have some trouble with this value....
>
> In model fm (without "Sex") the z-value of BreedSuc1 corresponds to 1.998.
>
> In model fm1("Sex" included) the z-value corresponds to 2.012.
>
> Nearly the same value in both models. But what can someone conclude from: p<
> 0.05, z=1.998 ??? Because this is what many people write in their result
> section (I was told to do so the same....).
>
> "You should see a LRT test statistic close to, but not exactly the same as,
> the square of the z value when you compare the models with and without that
> term.
> This is the sense in which the z-value is an approximation".
>
> I don't really understand how the z-value can be seen as an approximation?
> Am I missing some background knowledge here?
>
> Again, thanks heaps for your help!!!!
>
> Julia
>
>
>
>
>
>
>
> 2011/2/19 Douglas Bates <bates at stat.wisc.edu>
>>
>> Thank you for your questions and for transferring the discussion to
>> the R-SIG-Mixed-Models mailing list, as we had discussed.  I have also
>> copied the mailing list for a class on mixed-effects models that I am
>> teaching.
>>
>> I particularly appreciate your desire to learn about the model instead
>> of just quoting a p-value.  I often lament to my classes that
>> statisticians have been far too successful in propagating the idea of
>> p-values, to the extent that some researchers believe that is all that
>> is needed to learn about an analysis.
>>
>> On Sat, Feb 19, 2011 at 3:05 AM, Julia Sommerfeld
>> <Julia.Sommerfeld at utas.edu.au> wrote:
>> > Dear Douglas and list members,
>> >
>> > Apologies in advance if you might consider my questions as too simple to
>> > be asking the godfather of lme4 for an answer...thus, please feel free
>> > to
>> > ignore my email or to forward it to someone else.
>> >
>> > I'm a PhD student (Australia/Germany) working on tropical seabirds. As
>> > many of my PhD-collegues, I'm having some difficulties with the analysis
>> > of my data using lmer (family=binomial). While some say: What do you
>> > care
>> >
>> > about all the other values as long as you've got a p-value... I do
>> > believe
>> > that it is essential to understand WHAT I'm doing here and WHAT all
>> > these
>> > numbers/values mean.
>> >
>> > I've read the Chapters (lme4 Book Chapters) and publications about the
>> > use
>> > of lmer and searched the forums - but I don't find a satisfying answer.
>> > And I have the feeling that 1. the statistic lecture at my university
>> > was
>> > a joke (sad to say this) 2. that I need a huge statistical/mathematical
>> > background to fully understand GLMMs.
>> >
>> >
>> > One of the question I would like to answer is:
>> > Does the previous breeding success influences nest site fidelity?
>> >
>> > I have binomial data:
>> > SameSite=1 means birds use the same site
>> >
>> > SameSite=0 means birds change nest site
>> >
>> > BreedSuc1=1 Birds were successful in previous breeding season
>> > BreedSuc1=0 Birds were not successful "     "        "
>> >
>> > Sex= male, female
>> > Bird= Bird ID
>> >
>> > This is my model:
>>
>> > fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>>
>> > where Bird is my random factor (same birds were sampled more than once)
>>
>> One thing to note is that there are 46 different birds in the 54
>> observations.  Most birds will have just one observation so a random
>> effect for bird may not be necessary.
>>
>> > summary(fm)
>> >
>> > Generalized linear mixed model fit by the Laplace approximation
>> >
>> > Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
>> >   AIC   BIC logLik deviance
>> >  77.38 85.34 -34.69    69.38
>> > Random effects:
>> >  Groups Name        Variance Std.Dev.
>> >  Bird   (Intercept) 0.14080  0.37524
>> > Number of obs: 54, groups: Bird, 46
>> >
>> > Fixed effects:
>> >            Estimate Std. Error z value Pr(>|z|)
>> > (Intercept)  -0.3294     0.4890  -0.674   0.5006
>> > BreedSuc11    1.1988     0.5957   2.012   0.0442 *
>> > SexM          0.2215     0.5877   0.377   0.7062
>>
>> this suggests that sex is not an important factor in the model.  The
>> (Intercept) term is close to zero, relative to its standard error, but
>> we would retain it in the model as explained below.
>>
>> > ---
>> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> >
>> > Correlation of Fixed Effects:
>> >           (Intr) BrdS11
>> > BreedSuc11 -0.536
>> > SexM       -0.628  0.065
>> >
>> >
>> > From this summary output I do understand that the Breeding Success has a
>> > significant effect on nest-site fidelity (p<0.05).
>>
>> Yes, but ... this p-value should be used as a guide only.  As
>> described below a p-value must be viewed in context.  It is not a
>> property of the Breeding Success factor; it comes from a comparison of
>> two models and we should bear in mind that these models are before
>> interpreting this number.
>>
>> The interpretation of a p-value for a particular coefficient is that
>> it is an approximation to the p-value we would get from comparing the
>> model that has been fit to the mode fit without this particular
>> coefficient.  In this case the coefficient corresponds to one of the
>> terms in the model and I would advocate performing a likelihood ratio
>> test comparing the two models
>>
>> fm <- glmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>> fm0 <- glmer(SameSite~Sex+(1|Bird), family="binomial")  # the null
>> hypothesis model
>> anova(fm0, fm)
>>
>> Even though the function is called anova it will, in this case,
>> perform a likelihood ratio test (LRT).  It also prints the values of
>> AIC and BIC if you prefer to compare models according to one of those
>> criteria but I prefer using the likelihood ratio for nested models.
>>
>> However, before doing that comparison you should ask yourself whether
>> you want to compare models that have the, apparently unnecessary term
>> for Sex in them.  The way I would approach the model building is first
>> to reduce the model to
>>
>> fm1 <- lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
>>
>> You could then compare
>>
>> anova(fm1, fm)
>>
>> which I presume will give a large p-value for the LRT, so we prefer
>> the simpler model, fm1.  After that, I would compare
>>
>> fm2 <- lmer(SameSite ~ 1 + (1|Bird), family="binomial")
>> anova(fm2, fm1)
>>
>> to see if the BreedSuc1 factor is an important predictor in its own right.
>>
>> Note that we don't drop the implicit "(Intercept)" term, even though
>> it has a high p-value in the coefficient table.  The reason is that
>> the interpretation of the (Intercept) coefficient depends on the
>> coding of BreedSuc1.
>>
>> In model fm, the log-odds of site fidelity for a female bird (i.e.
>> SexM is 0) who was unsuccessful in breeding the previous season (i.e.
>> BreedSuc1 is 0) is -0.3294, corresponding to a probability of about
>> 42%
>>
>> > plogis(-0.3294)
>> [1] 0.4183866
>>
>> The log-odds of site fidelity for a female bird who was successful in
>> breeding is -0.3294 + 1.1988, corresponding to a probability of about
>> 70%
>>
>> > plogis(-0.3294 + 1.1988)
>> [1] 0.7046208
>>
>> If you had reversed the meaning of BreedSuc to BreedFail, where 0
>> indicates no failure at breeding and 1 indicates failure, then the
>> coefficient would change sign (i.e. the coefficient for BreedFail
>> would be -1.1988) and the intercept would change to
>>
>> > -0.3294 + 1.1988
>> [1] 0.8694
>>
>> because the reference level would now be a female bird who was
>> successful in breeding.
>>
>> Because the interpretation of the intercept depends upon the coding of
>> other factors, we retain it in the model whenever other terms are
>> retained.
>>
>>
>>
>> > But what else can I conclude from this model?
>> >
>> > Questions:
>> >
>> > 1.Random effects: What does the Random Effect table - the Variance, Std.
>> > Dev. and Intercept - tells me: Is there a random effect that my model
>> > has
>> > to account for?
>>
>> First I would remove the apparently unnecessary Sex term then,
>> ideally, I would check by comparing the fit of the reduced model to
>> that of a GLM without the random effect for Bird.  Unfortunately, I
>> don't think the definition of deviance for a glm fit is compatible
>> with that for a model fit by glmer.  This is something we will need to
>> fix.  For the time being I would instead examine the "caterpillar
>> plot" obtained with
>>
>> dotplot(ranef(fm1, postVar=TRUE))
>>
>> which represent the 95% prediction intervals for each of the birds.
>> If these all overlap zero comfortably I would conclude that the random
>> effect is not needed an fit a glm without a random effect for bird.
>> > Random effects:
>> >  Groups Name        Variance Std.Dev.
>> >  Bird   (Intercept) 0.14080  0.37524
>> > Number of obs: 54, groups: Bird, 46
>>
>> That estimated standard deviation is fairly large. We would expect a
>> range of contributions on the log-odds scale of about +/- 2 sd which,
>> at this point of the logistic curve corresponds to considerable
>> variability in predicted probabilities for birds with the same
>> characteristics.
>>
>> > 2. Fixed Effects: Again the Intercept? Not sure if I understand the
>> > meaning of it (sorry, explanation in Chapter I also doesn't help much)
>>
>> Actually in this model it is a bit different from the models described
>> in chapter 1.  I hope the explanation above makes sense.  Think of it
>> as the log-odds of site fidelity for a bird in the "reference group"
>> where reference group means that all the other variables are a the
>> zero level.
>>
>> > Fixed effects:
>> >            Estimate Std. Error z value Pr(>|z|)
>> > (Intercept)  -0.3294     0.4890  -0.674   0.5006
>> > BreedSuc11    1.1988     0.5957   2.012   0.0442 *
>> >
>> > SexM          0.2215     0.5877   0.377   0.7062
>> >
>> > 3. Meaning of the z-value? Why shall I mention it in te result section?
>>
>> I would regard the z-value as an approximation.  The quantity of
>> interest is the likelihood ratio test statistic which has a
>> chi-squared distribution under the null hypothesis (i.e. the term can
>> be deleted from the model without getting a significantly worse fit).
>> It happens that this would be a chi-squared distribution with 1 degree
>> of freedom, which corresponds to the square of a standard normal
>> distribution.  You should see a LRT test statistic close to, but not
>> exactly the same as, the square of the z value when you compare the
>> models with and without that term.  This is the sense in which the
>> z-value is an approximation.  To me the LRT statistic is more reliable
>> because it is based upon actually refitting the model.
>>
>> > 4. Estimate and Std. Error of the fixed effects? How can I tell from
>> > these
>> > values WHAT kind of effect (positiv, negativ?) these parameter have on
>> > nest-site fidelity? Do birds that were successful during the previous
>> > breeding success show a higher nest-site fidelity? Remember, I have
>> > binomial data...
>>
>> That is described above.  If you want the estimate of the site
>> fidelity for bird with certain characteristics you evaluate the
>> corresponding combination of coefficients and apply plogis to the
>> result.
>> > I would highly appreciate your feedback and/or suggestions of
>> >
>> > papers/chapters I could read for a better understanding of the output.
>> >
>> > Best regards,
>> >
>> >
>> > Julia
>>
>> I hope this helps.
>
>
>
>
>
> --
> Julia Sommerfeld - PhD Candidate
> Institute for Marine and Antarctic Studies
> University of Tasmania
> Private Bag 129, Hobart
> TAS 7001
>
> Phone: +61 458 247 348
> Email: julia.somma at gmx.de
> Julia.Sommerfeld at utas.edu.au
>




More information about the R-sig-mixed-models mailing list