[R-sig-ME] Interpretation of lmer output in R
Douglas Bates
bates at stat.wisc.edu
Mon Feb 21 20:48:23 CET 2011
Actually it only went to me because you sent the original to
r-sig-mixed-models-ownder at r-project.org, not to the list itself.
On Mon, Feb 21, 2011 at 1:44 PM, Julia Sommerfeld
<Julia.Sommerfeld at utas.edu.au> wrote:
>
> not sure if my email came through...
>
> -------
>
>
> Dear Douglas and list member,
>
> Thank you heaps for your answers. The interpretation of the summary output
> (lmer) is becoming much clearer now. I have to admit I had a slightly (not
> to say HUGE) different idea of the summary output.
>
> But a few questions still remain...
>
> I have tried out the suggestions with the following results:
>
> 1. I tested if "Sex" is an important factor in the model:
>
> fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
> fm1<-lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
> anova(fm1,fm)
>
> Data:
> Models:
> fm1: SameSite ~ BreedSuc1 + (1 | Bird)
> fm: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
> Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
> fm1 3 75.518 81.485 -34.759
> fm 4 77.379 85.335 -34.690 0.1387 1 0.7096
>
>
> 2. Since Sex is not "important", I compared fm1 with fm2 to test if
> BreedSuc1 is an important factor:
>
> fm2<-lmer(SameSite ~ 1 + (1|Bird), family="binomial")
> anova(fm2,fm1)
>
> Data:
> Models:
> fm2: SameSite ~ 1 + (1 | Bird)
> fm1: SameSite ~ BreedSuc1 + (1 | Bird)
> Df AIC BIC logLik Chisq Chi Df
> Pr(>Chisq)
> fm2 2 77.617 81.595 -36.808
> fm1 3 75.518 81.485 -34.759 4.0991 1
> 0.04291 *
>
>
> BUT: Ben (thanks for the input) disagreed here:
> "I'm afraid I have to disagree with Doug here. This kind of model reduction,
> while seemingly sensible (and not as abusive as large-scale, automated
> stepwise regression), is a mild form of data snooping. Don't do it ..."
>
> If I don't do it, what could be an alternative? Simply rely on the output of
> fm1 without looking at fm2?
>
>
> 3. From the output above, I conclude that BreedSuc1 has an effect on
> SameSite:
>
> summary(fm1)
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: SameSite ~ BreedSuc1 + (1 | Bird)
> AIC BIC logLik deviance
> 75.52 81.48 -34.76 69.52
> Random effects:
> Groups Name Variance Std.Dev.
> Bird (Intercept) 0.12332 0.35117
> Number of obs: 54, groups: Bird, 46
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -0.2135 0.3794 -0.563 0.5736
> BreedSuc11 1.1831 0.5921 1.998 0.0457 *
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
> (Intr)
> BreedSuc11 -0.638
>
>
> Now, again the interpretation of the Fixed effects:
>
> BreedSuc1 has a significant effect on SameSite.
>
> #site-fidelity (SameSite=0) of "bird" (bird(!), since I droppped "Sex"?)
> who was unsuccessful in breeding in the previous season (BreedSuc1=0),
> corresponds to a probability of 44%:
> plogis(-0.2135)
> [1] 0.4468268
>
>
> Further, site-fidelity (SameSite=0) of "bird" who was successful in
> breeding in the previous season (BreedSuc1=1), corresponds to a probability
> of 72%:
> plogis(-0.2135 + 1.1831)
> [1] 0.7250398
>
>
> Is my interpretation correct? So this output is always based on SameSite=0?
> (SameSite=0 means that birds did NOT change nest site, i.e. they show high
> site-fidelity). But what if SameSite=1 would mean high site-fidelity (no
> change of nests?)?
>
> 4. if I don't drop the term "Sex":
>
> summary(fm)
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
> AIC BIC logLik deviance
> 77.38 85.34 -34.69 69.38
> Random effects:
> Groups Name Variance Std.Dev.
> Bird (Intercept) 0.14080 0.37524
> Number of obs: 54, groups: Bird, 46
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -0.3294 0.4890 -0.674 0.5006
> BreedSuc11 1.1988 0.5957 2.012 0.0442 *
> Sex M 0.2215 0.5877 0.377 0.7062
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
> (Intr) BrdS11
> BreedSuc11 -0.536
> SexM -0.628 0.065
>
>
> site fidelity for a female bird (i.e. SexM is 0) who was unsuccessful in
> breeding the previous season (i.e.
> BreedSuc1 is 0) is -0.3294, corresponding to a probability of about 42%
>> plogis(-0.3294)
> [1] 0.4183866
>
> The log-odds of site fidelity for a female bird who was successful in
> breeding is -0.3294 + 1.1988, corresponding to a probability of about 70%
>> plogis(-0.3294 + 1.1988)
> [1] 0.7046208
>
>
> 5. The z-value: Sorry, I still have some trouble with this value....
>
> In model fm (without "Sex") the z-value of BreedSuc1 corresponds to 1.998.
>
> In model fm1("Sex" included) the z-value corresponds to 2.012.
>
> Nearly the same value in both models. But what can someone conclude from: p<
> 0.05, z=1.998 ??? Because this is what many people write in their result
> section (I was told to do so the same....).
>
> "You should see a LRT test statistic close to, but not exactly the same as,
> the square of the z value when you compare the models with and without that
> term.
> This is the sense in which the z-value is an approximation".
>
> I don't really understand how the z-value can be seen as an approximation?
> Am I missing some background knowledge here?
>
> Again, thanks heaps for your help!!!!
>
> Julia
>
>
>
>
>
>
>
> 2011/2/19 Douglas Bates <bates at stat.wisc.edu>
>>
>> Thank you for your questions and for transferring the discussion to
>> the R-SIG-Mixed-Models mailing list, as we had discussed. I have also
>> copied the mailing list for a class on mixed-effects models that I am
>> teaching.
>>
>> I particularly appreciate your desire to learn about the model instead
>> of just quoting a p-value. I often lament to my classes that
>> statisticians have been far too successful in propagating the idea of
>> p-values, to the extent that some researchers believe that is all that
>> is needed to learn about an analysis.
>>
>> On Sat, Feb 19, 2011 at 3:05 AM, Julia Sommerfeld
>> <Julia.Sommerfeld at utas.edu.au> wrote:
>> > Dear Douglas and list members,
>> >
>> > Apologies in advance if you might consider my questions as too simple to
>> > be asking the godfather of lme4 for an answer...thus, please feel free
>> > to
>> > ignore my email or to forward it to someone else.
>> >
>> > I'm a PhD student (Australia/Germany) working on tropical seabirds. As
>> > many of my PhD-collegues, I'm having some difficulties with the analysis
>> > of my data using lmer (family=binomial). While some say: What do you
>> > care
>> >
>> > about all the other values as long as you've got a p-value... I do
>> > believe
>> > that it is essential to understand WHAT I'm doing here and WHAT all
>> > these
>> > numbers/values mean.
>> >
>> > I've read the Chapters (lme4 Book Chapters) and publications about the
>> > use
>> > of lmer and searched the forums - but I don't find a satisfying answer.
>> > And I have the feeling that 1. the statistic lecture at my university
>> > was
>> > a joke (sad to say this) 2. that I need a huge statistical/mathematical
>> > background to fully understand GLMMs.
>> >
>> >
>> > One of the question I would like to answer is:
>> > Does the previous breeding success influences nest site fidelity?
>> >
>> > I have binomial data:
>> > SameSite=1 means birds use the same site
>> >
>> > SameSite=0 means birds change nest site
>> >
>> > BreedSuc1=1 Birds were successful in previous breeding season
>> > BreedSuc1=0 Birds were not successful " " "
>> >
>> > Sex= male, female
>> > Bird= Bird ID
>> >
>> > This is my model:
>>
>> > fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>>
>> > where Bird is my random factor (same birds were sampled more than once)
>>
>> One thing to note is that there are 46 different birds in the 54
>> observations. Most birds will have just one observation so a random
>> effect for bird may not be necessary.
>>
>> > summary(fm)
>> >
>> > Generalized linear mixed model fit by the Laplace approximation
>> >
>> > Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
>> > AIC BIC logLik deviance
>> > 77.38 85.34 -34.69 69.38
>> > Random effects:
>> > Groups Name Variance Std.Dev.
>> > Bird (Intercept) 0.14080 0.37524
>> > Number of obs: 54, groups: Bird, 46
>> >
>> > Fixed effects:
>> > Estimate Std. Error z value Pr(>|z|)
>> > (Intercept) -0.3294 0.4890 -0.674 0.5006
>> > BreedSuc11 1.1988 0.5957 2.012 0.0442 *
>> > SexM 0.2215 0.5877 0.377 0.7062
>>
>> this suggests that sex is not an important factor in the model. The
>> (Intercept) term is close to zero, relative to its standard error, but
>> we would retain it in the model as explained below.
>>
>> > ---
>> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> >
>> > Correlation of Fixed Effects:
>> > (Intr) BrdS11
>> > BreedSuc11 -0.536
>> > SexM -0.628 0.065
>> >
>> >
>> > From this summary output I do understand that the Breeding Success has a
>> > significant effect on nest-site fidelity (p<0.05).
>>
>> Yes, but ... this p-value should be used as a guide only. As
>> described below a p-value must be viewed in context. It is not a
>> property of the Breeding Success factor; it comes from a comparison of
>> two models and we should bear in mind that these models are before
>> interpreting this number.
>>
>> The interpretation of a p-value for a particular coefficient is that
>> it is an approximation to the p-value we would get from comparing the
>> model that has been fit to the mode fit without this particular
>> coefficient. In this case the coefficient corresponds to one of the
>> terms in the model and I would advocate performing a likelihood ratio
>> test comparing the two models
>>
>> fm <- glmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>> fm0 <- glmer(SameSite~Sex+(1|Bird), family="binomial") # the null
>> hypothesis model
>> anova(fm0, fm)
>>
>> Even though the function is called anova it will, in this case,
>> perform a likelihood ratio test (LRT). It also prints the values of
>> AIC and BIC if you prefer to compare models according to one of those
>> criteria but I prefer using the likelihood ratio for nested models.
>>
>> However, before doing that comparison you should ask yourself whether
>> you want to compare models that have the, apparently unnecessary term
>> for Sex in them. The way I would approach the model building is first
>> to reduce the model to
>>
>> fm1 <- lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
>>
>> You could then compare
>>
>> anova(fm1, fm)
>>
>> which I presume will give a large p-value for the LRT, so we prefer
>> the simpler model, fm1. After that, I would compare
>>
>> fm2 <- lmer(SameSite ~ 1 + (1|Bird), family="binomial")
>> anova(fm2, fm1)
>>
>> to see if the BreedSuc1 factor is an important predictor in its own right.
>>
>> Note that we don't drop the implicit "(Intercept)" term, even though
>> it has a high p-value in the coefficient table. The reason is that
>> the interpretation of the (Intercept) coefficient depends on the
>> coding of BreedSuc1.
>>
>> In model fm, the log-odds of site fidelity for a female bird (i.e.
>> SexM is 0) who was unsuccessful in breeding the previous season (i.e.
>> BreedSuc1 is 0) is -0.3294, corresponding to a probability of about
>> 42%
>>
>> > plogis(-0.3294)
>> [1] 0.4183866
>>
>> The log-odds of site fidelity for a female bird who was successful in
>> breeding is -0.3294 + 1.1988, corresponding to a probability of about
>> 70%
>>
>> > plogis(-0.3294 + 1.1988)
>> [1] 0.7046208
>>
>> If you had reversed the meaning of BreedSuc to BreedFail, where 0
>> indicates no failure at breeding and 1 indicates failure, then the
>> coefficient would change sign (i.e. the coefficient for BreedFail
>> would be -1.1988) and the intercept would change to
>>
>> > -0.3294 + 1.1988
>> [1] 0.8694
>>
>> because the reference level would now be a female bird who was
>> successful in breeding.
>>
>> Because the interpretation of the intercept depends upon the coding of
>> other factors, we retain it in the model whenever other terms are
>> retained.
>>
>>
>>
>> > But what else can I conclude from this model?
>> >
>> > Questions:
>> >
>> > 1.Random effects: What does the Random Effect table - the Variance, Std.
>> > Dev. and Intercept - tells me: Is there a random effect that my model
>> > has
>> > to account for?
>>
>> First I would remove the apparently unnecessary Sex term then,
>> ideally, I would check by comparing the fit of the reduced model to
>> that of a GLM without the random effect for Bird. Unfortunately, I
>> don't think the definition of deviance for a glm fit is compatible
>> with that for a model fit by glmer. This is something we will need to
>> fix. For the time being I would instead examine the "caterpillar
>> plot" obtained with
>>
>> dotplot(ranef(fm1, postVar=TRUE))
>>
>> which represent the 95% prediction intervals for each of the birds.
>> If these all overlap zero comfortably I would conclude that the random
>> effect is not needed an fit a glm without a random effect for bird.
>> > Random effects:
>> > Groups Name Variance Std.Dev.
>> > Bird (Intercept) 0.14080 0.37524
>> > Number of obs: 54, groups: Bird, 46
>>
>> That estimated standard deviation is fairly large. We would expect a
>> range of contributions on the log-odds scale of about +/- 2 sd which,
>> at this point of the logistic curve corresponds to considerable
>> variability in predicted probabilities for birds with the same
>> characteristics.
>>
>> > 2. Fixed Effects: Again the Intercept? Not sure if I understand the
>> > meaning of it (sorry, explanation in Chapter I also doesn't help much)
>>
>> Actually in this model it is a bit different from the models described
>> in chapter 1. I hope the explanation above makes sense. Think of it
>> as the log-odds of site fidelity for a bird in the "reference group"
>> where reference group means that all the other variables are a the
>> zero level.
>>
>> > Fixed effects:
>> > Estimate Std. Error z value Pr(>|z|)
>> > (Intercept) -0.3294 0.4890 -0.674 0.5006
>> > BreedSuc11 1.1988 0.5957 2.012 0.0442 *
>> >
>> > SexM 0.2215 0.5877 0.377 0.7062
>> >
>> > 3. Meaning of the z-value? Why shall I mention it in te result section?
>>
>> I would regard the z-value as an approximation. The quantity of
>> interest is the likelihood ratio test statistic which has a
>> chi-squared distribution under the null hypothesis (i.e. the term can
>> be deleted from the model without getting a significantly worse fit).
>> It happens that this would be a chi-squared distribution with 1 degree
>> of freedom, which corresponds to the square of a standard normal
>> distribution. You should see a LRT test statistic close to, but not
>> exactly the same as, the square of the z value when you compare the
>> models with and without that term. This is the sense in which the
>> z-value is an approximation. To me the LRT statistic is more reliable
>> because it is based upon actually refitting the model.
>>
>> > 4. Estimate and Std. Error of the fixed effects? How can I tell from
>> > these
>> > values WHAT kind of effect (positiv, negativ?) these parameter have on
>> > nest-site fidelity? Do birds that were successful during the previous
>> > breeding success show a higher nest-site fidelity? Remember, I have
>> > binomial data...
>>
>> That is described above. If you want the estimate of the site
>> fidelity for bird with certain characteristics you evaluate the
>> corresponding combination of coefficients and apply plogis to the
>> result.
>> > I would highly appreciate your feedback and/or suggestions of
>> >
>> > papers/chapters I could read for a better understanding of the output.
>> >
>> > Best regards,
>> >
>> >
>> > Julia
>>
>> I hope this helps.
>
>
>
>
>
> --
> Julia Sommerfeld - PhD Candidate
> Institute for Marine and Antarctic Studies
> University of Tasmania
> Private Bag 129, Hobart
> TAS 7001
>
> Phone: +61 458 247 348
> Email: julia.somma at gmx.de
> Julia.Sommerfeld at utas.edu.au
>
More information about the R-sig-mixed-models
mailing list