[R-sig-ME] Interpretation of GLMM output in R
Lionel
hughes.dupond at gmx.de
Sun Aug 2 13:34:47 CEST 2015
Dear Yvonne,
To answer Question 1, I would scale both predictors to unit standard
deviation (ie fad.data$Volume_std<-scale(fad.data$Volume,scale=TRUE))
and then compare the back-transformed coefficients, as you have a
Poisson model this would be then exp(fixef(fad)). Say that you get a
value of 1.5 for the variable Volume, you would then interpret this as:
an increase in one standard deviation of Volume leads to an increase of
1.5*Pollinator, put differently every increase of one standard deviation
in Volume increase Pollinator by 50%. Having scaled the predictor would
also allow you to compare the strength of their effects. Also note that
effect sizes do not give informations about how much variance in the
response variable is explained by individuals terms.
Concerning the LRT to test the effect of predictors, it is not the best
available option (see http://glmm.wikidot.com/faq#summary) the function
"mixed" in the "afex" package allow you to do parametric bootstrap to
test fixed effects (using type 2 or 3 tests). For effects with very
small p-values (like volume) using LRT or parametric bootstrap will most
certainly lead to the same interpretation (ie volume has a significant
effect), for p-values close to 0.05 (like parasitoids) you might get
different interpretations.
Sincerely yours,
Lionel
On 02/08/2015 11:31, Yvonne Hiller wrote:
> Dear Paul, Thierry and list members,
> first of all:
> Thanks so much for going through my questions, although it came in an
> inappropriate look with wired brackets and unspecified questions. Apologies for that. So I
> try again to specify things.
>
> The model includes following variables:
> Pollinator = pollinator offspring
> Parasitoids = parasitoid offspring
> Volume = measured fig size
> tree = tree ID
>
> This is the output of the model:
>
> fad <-glmer(Pollinator~Parasitoids+volume+(1|tree),data=fad.data,family="poisson",verbose=TRUE)
> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
> Family: poisson ( log )
> Formula: Pollinator ~ Parasitoids + volume + (1 | tree)
> Data: fad.data
>
> AIC BIC logLik deviance df.resid
> 1550.6 1560.2 -771.3 1542.6 76
>
> Scaled residuals:
> Min 1Q Median 3Q Max
> -7.4825 -2.4111 -0.4725 2.4321 9.3409
>
> Random effects:
> Groups Name Variance Std.Dev.
> tree (Intercept) 0.1063 0.3261
> Number of obs: 80, groups: tree, 8
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) 2.3167832 0.1634876 14.171 <2e-16 ***
> Parasitoids -0.0046908 0.0022169 -2.116 0.0344 *
> volume 0.0069527 0.0006197 11.220 <2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
> (Intr) Prstds
> Parasitoids -0.144
> volume -0.656 -0.139
>
> 1. I want to know if the predictor variables have an effect on pollinator
> offspring. Therefore, I tested if “Parasitoids” and “volume” are important factors in the model by the LRT.
>
> fad1 <- glmer(Pollinator~1+volume+(1|tree),data=fad.data,family="poisson",verbose=TRUE)
> fad2 <- glmer(Pollinator~Parasitoids+1+(1|tree),data=fad.data,family="poisson",verbose=TRUE)
>
> anova(fad,fad1)
> Models:
> fad1: Pollinator ~ 1 + volume + (1 | tree)
> fad: Pollinator ~ Parasitoids + volume + (1 | tree)
> Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
> fad1 3 1553.1 1560.3 -773.57 1547.1
> fad 4 1550.6 1560.2 -771.32 1542.6 4.5105 1 0.03369 *
> anova(fad,fad2)
> Models:
> fad2: Pollinator ~ Parasitoids + 1 + (1 | tree)
> fad: Pollinator ~ Parasitoids + volume + (1 | tree)
> Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
> fad2 3 1669.2 1676.4 -831.61 1663.2
> fad 4 1550.6 1560.2 -771.32 1542.6 120.59 1 < 2.2e-16 ***
> >From the output, I conclude that both predictor variables have an effect on pollinator offspring. Is it reasonable to do the LRT for testing the predictor variables? Is there any chance to get an effect size of how much the predictor variables explain variance in the response variable? Since I know they have an effect, interpreation would be different when the effect of one variable is small or large.
>
> 2. From suggestions of Thierry and Paul and from the the caterpillar plot:
> dotplot(ranef(fad, condVar=TRUE))$tree
> I do understand that there is indeed a fairly large remaining variability in the number of pollinator offspring among trees, but
> the relative effect of “tree” does not depend on “Parasitoids” and “volume”.
> Again, I would highly appreciate your suggestions and feedback on this. Thank you very much in advance.
> Best wishes,Yvonne
>
> ________________________________
>> Date: Sun, 2 Aug 2015 00:08:35 +0200
>> Subject: Re: [R-sig-ME] Interpretation of GLMM output in R
>> From: thierry.onkelinx at inbo.be
>> To: pauljohn32 at gmail.com
>> CC: yvonne.hiller at hotmail.de; r-sig-mixed-models at r-project.org
>>
>>
>> Paul,
>>
>> Note that exp(beta0 + beta1*parasitoids + btree) is equivalent to
>> exp(beta0 + beta1*parasitoids)*exp(btree). So the relative effect of
>> tree doesn't depend on the other effects.
>>
>> I tend to look at the ratio of the 97.5% and 2.5% quantiles of the
>> random effect. exp(q*sigma)/exp(-q*sigma) or simplified exp(2*q*sigma)
>> with q =1.96 (97.5% quantile of a normal distribution) and sigma= the
>> standard deviation of the random effect. exp(2*1.96*0.3261) ~ 3.5 So
>> the 97.5% quantile tree has about 3.5 times the number of pollinators
>> of the 2.5% quantile tree.
>>
>> Best regards,
>>
>> Thierry
>>
>> Op 1-aug.-2015 15:43 schreef "Paul Johnson"
>> <pauljohn32 at gmail.com<mailto:pauljohn32 at gmail.com>>:
>> Hi
>>
>> Your message comes through with weird line breaks, you should turn off
>> the HTML compose option in your mail program, just write text.
>>
>> On Fri, Jul 31, 2015 at 8:42 AM, Yvonne Hiller
>> <yvonne.hiller at hotmail.de<mailto:yvonne.hiller at hotmail.de>> wrote:
>>>
>>>
>>>
>>> Dear Ben Bolker and list members,
>>>
>>>
>>>
>>> I´am a PhD-student working on tropical plant-pollinator
>>> interactions (the fig-fig wasp mutualism).
>>>
>>> I have some difficulties with my analyses of my data
>>> using lmer (family = Poisson). I have read a lot of threads and
>> searched for
>>> solutions in Zuur 2009, though it was not completely satisfying. I
>> looked at
>>> other papers using GLMM (espeacially for the fig-fig wasp mutualism),
>> but there
>>> were so many different ways in reporting and interpreting p-values,
>> AICs etc.
>>> Therefore, I would be very grateful, if you could go through my output and
>>> answer my questions to hopefully fully understand GLMM. Thank you so
>> much in
>>> advance.
>>>
>>>
>>>
>>> One of my questions is:
>>>
>>> Have parasitoids (offspring) and the volume size an
>>> effect on the number of pollinator offspring?
>>>
>>>
>>>
>>> As pollinators are parasitized by parasitoids, I would
>>> expect a negative impact on pollinator offspring. As the size of the
>> fig fruits
>>> might be an additionally factor (due to limited oviposition sites
>> inside the fruit
>>> or due to the ovipositor length of a parasitoid is to short to reach
>> inner most
>>> ovules galled by pollinators), I included it in the model.
>>>
>>>
>>>
>>> I have poisson data (counts):
>>>
>>>
>>>
>>> Pollinator
>>> = pollinator offspring
>>>
>>> Parasitoids = parasitoid offspring
>>>
>>> volume = fig fruit size measurement
>>>
>>> tree = Tree ID
>>>
>>>
>>>
>>> I have data from trees in the forest and in the
>>> farmland, as well as data of the rainy and dry season. I have chosen
>> to perform
>>> lmer for each season and habitat separately. For each tree, I have
>> collected 10
>>> fig fruits to count offspring of wasps at the trophic level (pollinator,
>>> parasitoid) and to measure the size of the fruit (volume). As I have 10 fig
>>> fruits per tree, I would use tree as a random effect, to account for
>> unmeasured
>>> variance between trees. I have found different approaches to that in
>>> literature. For instance: We did not include‘ tree ’ and ‘
>>> syconium ’ as random factors, because wasps were free to move between
>> syconia
>>> on the same tree, and between trees.
>>>
>>> Therefore,
>>> I am not sure if I have to use GLMM with trees as a random effect or
>> if it’s
>>> also possible to use GLM without the random effect.
>>>
>>>
>>>
>>> The inclusion of
>>> fig volume in this manner removed the need to use “fig” as an
>> additional nested
>>> factor within “tree”. Is that right?
>>>
>>>
>>>
>>> Here, I
>>> present the model with the random effect.
>>>
>>> So, lets start with the model of the farmland during
>>> the dry season:
>>>
>>>
>>>
>>> fad <-
>> lmer(Pollinator~Parasitoids+volume+(1|tree),family="poisson",verbose=TRUE)
>>>
>>>
>>> summary (fad)
>>>
>>> Generalized linear mixed model fit by the Laplace
>>> approximation
>>>
>>> Formula: Pollinator ~ Parasitoids + volume + (1 |
>>> tree)
>>>
>>> Data:
>>> fad.data
>>>
>>> AIC BIC logLik deviance
>>>
>>> 1166 1175
>>> -578.9 1158
>>>
>>> Random effects:
>>>
>>> Groups
>>> Name Variance Std.Dev.
>>>
>>> tree (Intercept) 0.10634 0.3261
>>>
>>>
>>> Number of obs: 80, groups: tree, 8
>>>
>>>
>>>
>>> Fixed effects:
>>>
>>>
>>> Estimate Std. Error z value Pr(>|z|)
>>>
>>> (Intercept)
>>> 2.3167867 0.1632747 14.190
>>> <2e-16 ***
>>>
>>> Parasitoids -0.0046908
>>> 0.0021686 -2.163 0.0305 *
>>>
>>>
>>> volume
>>> 0.0069525 0.0006108 11.383
>>> <2e-16 ***
>>>
>>> ---
>>>
>>> Signif.
>>> codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05
>>> ‘.’ 0.1 ‘ ’ 1
>>>
>>>
>>>
>>> Correlation of Fixed Effects:
>>>
>>>
>>> (Intr) Prstds
>>>
>>> Parasitoids -0.160
>>>
>>> volume
>>> -0.657 -0.107
>>>
>>>
>>>
>>> 1. Random effects: What does the Random
>>> Effect table - the Variance, Std. Dev. and Intercept tells me?
>>>
>>>
>> Your model assumes that the outcome is Poisson with expected value
>> exp(beta0 + beta1*parasitoids + btree)
>>
>> btree is a unique added amount for each tree. The estimate of the
>> number btree's variance across trees is 0.1.
>>
>> What that 0.1 means in terms of the predicted outcome? Well, that
>> mostly depends on how big beta0 + beta1*parasitoids is. If that
>> number is huge, say 1000, then adding a thing with variance 0.1 won't
>> matter much.
>>
>> On the other hand, if it is 0.01, then the random effect at the tree
>> level is very large, compared to the systematic components in your
>> model. When the link function gets applied, the distribution of
>> outcomes changes in an interesting way.
>>
>> If you run ranef(), it will spit out the estimates of the random
>> differences among trees (btree "BLUPS"). If you run the predict
>> method, you can see how those map out to predicted values (exp(beta0 +
>> beta1 parasitoids + btree)
>>
>>> So it says that the
>>> between–tree variability is fairly large. But I don’t understand to what it
>>> relates. Does it mean that pollinator offspring variance is high between
>>> trees and might be overestimating the explanatory variables?
>>>
>>>
>>>
>>> 2. What can I
>>> conclude from the model regarding the fixed effects and how to report
>> about it
>>> (with or without p-values, z-values, estimates)?
>>>
>>>
>>>
>>> So of what I know is that
>>> the p-value is only a guide and that it is rather a comparison of two
>> models.
>>> What are the two models and can I compare them.
>>>
>>>
>> I am puzzled why you see p values at all. In the version of lme4 I'm
>> running now, I don't see p values.
>>
>> Lets compare versions, since I'm pretty sure p values were removed
>> quite a while ago.
>>
>>> sessionInfo()
>> R version 3.1.2 (2014-10-31)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] lme4_1.1-8 Matrix_1.2-2
>>
>> loaded via a namespace (and not attached):
>> [1] grid_3.1.2 lattice_0.20-33 MASS_7.3-43 minqa_1.2.4
>> [5] nlme_3.1-121 nloptr_1.0.4 Rcpp_0.12.0 splines_3.1.2
>> [9] tools_3.1.2
>>
>> Anyway...
>>
>> If you had a huge sample, those p values would be accurate.
>>
>> You have a small sample, there are other, more computationally
>> intensive ways, to get p values. Read the Jrnl Stat Software paper b
>> y the lmer team, they describe profiling and bootstrapping. You have
>> small enough sample, could do either one.
>>
>>> 4. What does the
>>> correlation of fixed effects tells me?
>>>
>> It is a hint about multicollinearity & numerical instability, so far as
>> I know.
>>
>>>
>>> 5. Is it right
>>> that the estimates tells me whether the relation of the fixed effects to
>>> pollinator offspring is positive or negative?
>>>
>>>
>> Best way to get answer is to plot the predicted values from the model.
>> Use the predict function to plot for various values of the predictor.
>>
>>> 6. Can I
>>> calculate an effect size on each explanatory variable?
>>>
>> Only if you think the term "effect size" is meaningful. And if you
>> have a formula for one. In my experience with consulting here, it
>> means anything the researcher wants to call a summary number.
>>
>> I've come to loathe the term because somebody in the US Dept. of
>> Education mandated all studies report standardized effect sizes,
>> forcing everybody to make Herculean assumptions about all kinds of
>> model parameters to get Cohen's D or whatnot.
>>
>>>
>>> I would highly appreciate your feedback on this.
>>> Thanks so much in advance.
>>>
>> Good luck. Next time, use a text only email composer and try to ask 1
>> specific question. You are more likely to get attention if people can
>> easily read the message and see what you want. This one was difficult
>> to read (for me at least) and also somewhat vague.
>>
>>>
>>> Best wishes,
>>>
>>> Yvonne Hiller
>>>
>>>
>>>
>>>
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>>
>>> _______________________________________________
>>>
>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>
>> mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>>
>>
>> --
>> Paul E. Johnson
>> Professor, Political Science Director
>> 1541 Lilac Lane, Room 504 Center for Research Methods
>> University of Kansas University of Kansas
>> http://pj.freefaculty.org http://crmda.ku.edu
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>
>> mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
More information about the R-sig-mixed-models
mailing list