> [I had started to answer this when Doug's answer came through, so I
> will interleave my answers where they expand or differ ...]
>
> On 11-02-19 10:04 AM, Douglas Bates wrote:
>> > Thank you for your questions and for transferring the discussion to
>> > the R-SIG-Mixed-Models mailing list, as we had discussed. I have also
>> > copied the mailing list for a class on mixed-effects models that I am
>> > teaching.
>> >
>> > I particularly appreciate your desire to learn about the model instead
>> > of just quoting a p-value. I often lament to my classes that
>> > statisticians have been far too successful in propagating the idea of
>> > p-values, to the extent that some researchers believe that is all that
>> > is needed to learn about an analysis.
>> >
>> > On Sat, Feb 19, 2011 at 3:05 AM, Julia Sommerfeld
>> > wrote:
>>> >> Dear Douglas and list members,
>>> >>
>>> >> Apologies in advance if you might consider my questions as too simple to
>>> >> be asking the godfather of lme4 for an answer...thus, please feel free to
>>> >> ignore my email or to forward it to someone else.
>>> >>
>>> >> I'm a PhD student (Australia/Germany) working on tropical seabirds. As
>>> >> many of my PhD-collegues, I'm having some difficulties with the analysis
>>> >> of my data using lmer (family=binomial). While some say: What do you care
>>> >>
>>> >> about all the other values as long as you've got a p-value... I do believe
>>> >> that it is essential to understand WHAT I'm doing here and WHAT all these
>>> >> numbers/values mean.
>>> >>
>>> >> I've read the Chapters (lme4 Book Chapters) and publications about the use
>>> >> of lmer and searched the forums - but I don't find a satisfying answer.
>>> >> And I have the feeling that 1. the statistic lecture at my university was
>>> >> a joke (sad to say this) 2. that I need a huge statistical/mathematical
>>> >> background to fully understand GLMMs.
> If you have a reasonably thorough understanding of both linear mixed
> models (LMMs) and generalized linear models (GLMs) you can usually
> triangulate to get a good idea of what GLMMs are doing. I would
> strongly recommend Zuur et al's book on mixed models: I don't agree with
> absolutely everything says, but it's the most accessible treatment for
> ecologists that I have seen. Faraway 2006 is pretty good too (although
> Zuur covers many more of the complexities that ecologists end up dealing
> with).
>
>
>>> >> One of the question I would like to answer is:
>>> >> Does the previous breeding success influences nest site fidelity?
>>> >>
>>> >> I have binomial data:
>>> >> SameSite=1 means birds use the same site
>>> >>
>>> >> SameSite=0 means birds change nest site
>>> >>
>>> >> BreedSuc1=1 Birds were successful in previous breeding season
>>> >> BreedSuc1=0 Birds were not successful " " "
>>> >>
>>> >> Sex= male, female
>>> >> Bird= Bird ID
>>> >>
>>> >> This is my model:
>> >
>>> >> fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>> >
>>> >> where Bird is my random factor (same birds were sampled more than once)
>> >
>> > One thing to note is that there are 46 different birds in the 54
>> > observations. Most birds will have just one observation so a random
>> > effect for bird may not be necessary.
>> >
>>> >> summary(fm)
>>> >>
>>> >> Generalized linear mixed model fit by the Laplace approximation
>>> >>
>>> >> Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
>>> >> AIC BIC logLik deviance
>>> >> 77.38 85.34 -34.69 69.38
>>> >> Random effects:
>>> >> Groups Name Variance Std.Dev.
>>> >> Bird (Intercept) 0.14080 0.37524
>>> >> Number of obs: 54, groups: Bird, 46
>>> >>
>>> >> Fixed effects:
>>> >> Estimate Std. Error z value Pr(>|z|)
>>> >> (Intercept) -0.3294 0.4890 -0.674 0.5006
>>> >> BreedSuc11 1.1988 0.5957 2.012 0.0442 *
>>> >> SexM 0.2215 0.5877 0.377 0.7062
>> >
>> > this suggests that sex is not an important factor in the model. The
>> > (Intercept) term is close to zero, relative to its standard error, but
>> > we would retain it in the model as explained below.
>> >
>>> >> ---
>>> >> Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>>> >>
>>> >> Correlation of Fixed Effects:
>>> >> (Intr) BrdS11
>>> >> BreedSuc11 -0.536
>>> >> SexM -0.628 0.065
>>> >>
>>> >>
>>> >> From this summary output I do understand that the Breeding Success has a
>>> >> significant effect on nest-site fidelity (p<0.05).
>> >
>> > Yes, but ... this p-value should be used as a guide only. As
>> > described below a p-value must be viewed in context. It is not a
>> > property of the Breeding Success factor; it comes from a comparison of
>> > two models and we should bear in mind that these models are before
>> > interpreting this number.
>> >
>> > The interpretation of a p-value for a particular coefficient is that
>> > it is an approximation to the p-value we would get from comparing the
>> > model that has been fit to the mode fit without this particular
>> > coefficient. In this case the coefficient corresponds to one of the
>> > terms in the model and I would advocate performing a likelihood ratio
>> > test comparing the two models
>> >
>> > fm<- glmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>> > fm0<- glmer(SameSite~Sex+(1|Bird), family="binomial") # the null
>> > hypothesis model
>> > anova(fm0, fm)
>> >
>> > Even though the function is called anova it will, in this case,
>> > perform a likelihood ratio test (LRT). It also prints the values of
>> > AIC and BIC if you prefer to compare models according to one of those
>> > criteria but I prefer using the likelihood ratio for nested models.
> Yes, although even the likelihood ratio test (although better than
> the default Wald test printed by summary()) is an approximation based on
> large sample sizes; if you're interested in seeing the effect of this
> approximation, you can try the methods shown in the examples of the
> "simulate-mer" help page on the latest release of lme4 (should be on
> CRAN now)
Ok. So LRT is considered the most appropriate method at the moment? It
seems like Wald was the suggested method in your TREE paper (2009), or
have I missed something?
Is Wald considered less conservative? Although I get slightly higher
P-values with Wald compared to LRT (Poisson, no overdispersion, rather
big sample size).
>> >
>> > However, before doing that comparison you should ask yourself whether
>> > you want to compare models that have the, apparently unnecessary term
>> > for Sex in them. The way I would approach the model building is first
>> > to reduce the model to
>> >
>> > fm1<- lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
>> >
>> > You could then compare
>> >
>> > anova(fm1, fm)
>> >
>> > which I presume will give a large p-value for the LRT, so we prefer
>> > the simpler model, fm1. After that, I would compare
>> >
>> > fm2<- lmer(SameSite ~ 1 + (1|Bird), family="binomial")
>> > anova(fm2, fm1)
>> >
>> > to see if the BreedSuc1 factor is an important predictor in its own right.
>> >
>> > Note that we don't drop the implicit "(Intercept)" term, even though
>> > it has a high p-value in the coefficient table. The reason is that
>> > the interpretation of the (Intercept) coefficient depends on the
>> > coding of BreedSuc1.
> I'm afraid I have to disagree with Doug here. This kind of model
> reduction, while seemingly sensible (and not as abusive as large-scale,
> automated stepwise regression), is a mild form of data snooping. Don't
> do it ...
>
>> >
>> > In model fm, the log-odds of site fidelity for a female bird (i.e.
>> > SexM is 0) who was unsuccessful in breeding the previous season (i.e.
>> > BreedSuc1 is 0) is -0.3294, corresponding to a probability of about
>> > 42%
>> >
>>> >> plogis(-0.3294)
>> > [1] 0.4183866
>> >
>> > The log-odds of site fidelity for a female bird who was successful in
>> > breeding is -0.3294 + 1.1988, corresponding to a probability of about
>> > 70%
>> >
>>> >> plogis(-0.3294 + 1.1988)
>> > [1] 0.7046208
>> >
>> > If you had reversed the meaning of BreedSuc to BreedFail, where 0
>> > indicates no failure at breeding and 1 indicates failure, then the
>> > coefficient would change sign (i.e. the coefficient for BreedFail
>> > would be -1.1988) and the intercept would change to
>> >
>>> >> -0.3294 + 1.1988
>> > [1] 0.8694
>> >
>> > because the reference level would now be a female bird who was
>> > successful in breeding.
>> >
>> > Because the interpretation of the intercept depends upon the coding of
>> > other factors, we retain it in the model whenever other terms are
>> > retained.
>> >
>> >
>> >
>>> >> But what else can I conclude from this model?
>>> >>
>>> >> Questions:
>>> >>
>>> >> 1.Random effects: What does the Random Effect table - the Variance, Std.
>>> >> Dev. and Intercept - tells me: Is there a random effect that my model has
>>> >> to account for?
>> >
>> > First I would remove the apparently unnecessary Sex term then,
>> > ideally, I would check by comparing the fit of the reduced model to
>> > that of a GLM without the random effect for Bird. Unfortunately, I
>> > don't think the definition of deviance for a glm fit is compatible
>> > with that for a model fit by glmer. This is something we will need to
>> > fix. For the time being I would instead examine the "caterpillar
>> > plot" obtained with
>> >
>> > dotplot(ranef(fm1, postVar=TRUE))
>> >
>> > which represent the 95% prediction intervals for each of the birds.
>> > If these all overlap zero comfortably I would conclude that the random
>> > effect is not needed an fit a glm without a random effect for bird.
>>> >> Random effects:
>>> >> Groups Name Variance Std.Dev.
>>> >> Bird (Intercept) 0.14080 0.37524
>>> >> Number of obs: 54, groups: Bird, 46
>> >
>> > That estimated standard deviation is fairly large. We would expect a
>> > range of contributions on the log-odds scale of about ± 2 sd which,
>> > at this point of the logistic curve corresponds to considerable
>> > variability in predicted probabilities for birds with the same
>> > characteristics.
> Hmmm. My interpretation is a little bit different here too. I
> wouldn't try to remove it: I would say that there is indeed a fairly
> large remaining variability in the probability that different birds of
> the same sex will stay at the site even given the same breeding history.
> You can compare the magnitude of the standard deviation to the
> magnitudes of the fixed effects -- they're on the same scale -- so the
> between-bird variability is fairly large relative to the difference
> between sexes, but small relative to the effect of breeding history.
>
>> >
>>> >> 2. Fixed Effects: Again the Intercept? Not sure if I understand the
>>> >> meaning of it (sorry, explanation in Chapter I also doesn't help much)
>> >
>> > Actually in this model it is a bit different from the models described
>> > in chapter 1. I hope the explanation above makes sense. Think of it
>> > as the log-odds of site fidelity for a bird in the "reference group"
>> > where reference group means that all the other variables are a the
>> > zero level.
>> >
>>> >> Fixed effects:
>>> >> Estimate Std. Error z value Pr(>|z|)
>>> >> (Intercept) -0.3294 0.4890 -0.674 0.5006
>>> >> BreedSuc11 1.1988 0.5957 2.012 0.0442 *
>>> >>
>>> >> SexM 0.2215 0.5877 0.377 0.7062
>>> >>
>>> >> 3. Meaning of the z-value? Why shall I mention it in te result section?
>> >
>> > I would regard the z-value as an approximation. The quantity of
>> > interest is the likelihood ratio test statistic which has a
>> > chi-squared distribution under the null hypothesis (i.e. the term can
>> > be deleted from the model without getting a significantly worse fit).
>> > It happens that this would be a chi-squared distribution with 1 degree
>> > of freedom, which corresponds to the square of a standard normal
>> > distribution. You should see a LRT test statistic close to, but not
>> > exactly the same as, the square of the z value when you compare the
>> > models with and without that term. This is the sense in which the
>> > z-value is an approximation. To me the LRT statistic is more reliable
>> > because it is based upon actually refitting the model.
>> >
>>> >> 4. Estimate and Std. Error of the fixed effects? How can I tell from these
>>> >> values WHAT kind of effect (positiv, negativ?) these parameter have on
>>> >> nest-site fidelity? Do birds that were successful during the previous
>>> >> breeding success show a higher nest-site fidelity? Remember, I have
>>> >> binomial data...
>> >
>> > That is described above. If you want the estimate of the site
>> > fidelity for bird with certain characteristics you evaluate the
>> > corresponding combination of coefficients and apply plogis to the
>> > result.
>>> >> I would highly appreciate your feedback and/or suggestions of
>>> >> papers/chapters I could read for a better understanding of the output.
>>> >>
[[alternative HTML version deleted]]