[R-sig-ME] Mixed model interpretation with interaction

René at gmail.com
Sun Jun 9 16:57:06 CEST 2019


Hi,

I don't know if this adds anything new, but the most direct answers that
come to my mind would be:

1) It seems you use dummy coding, and this defines the interpretation of
the estimated coefficients; the interpretation would be different with
effect or contrast coding, which is often preferred because it is easier to
interpret. (Dummy coding has some 'fitting' advantages, which are mainly
discussed with respect to centered vs. non-centered estimation; that may be
insightful to look up on the internet.) But whatever coding scheme you use,
the model ultimately estimates cell means (in your case on a log-odds
scale), and you need to check how to recover these cell means from your
coefficients and how to back-transform them. One way of doing this is via
marginal predictions, as Daniel points out.
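For instance, a rough sketch with the emmeans package (one option among
several; this assumes the fitted glmer object M1 and the factors feed and
year from your mail):

library(emmeans)
# estimated marginal means for every feed x year cell,
# back-transformed from the logit (log-odds) scale to probabilities
emmeans(M1, ~ feed * year, type = "response")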

2) For another (technical) illustration: a design like yours with (e.g.)
2 feeding sites and 2 years is a 2 (site 1 vs. site 2) by 2 (year 1 vs.
year 2) independent-measures design, or 2 x 2 for short. It could simply be
expressed as 4 probabilities, or as 4 means on the log scale, one mean for
each design cell; that would be the "centered" variant of estimation.
Dummy coding instead implies a non-centered (but mathematically
equivalent, standard) coding:
If the model is:
y = site+year (ignoring random effects now), then
cellmean(site1:year1) = Model_Intercept
cellmean(site1:year2) = Model_Intercept + year2
cellmean(site2:year1) = Model_Intercept + site2
cellmean(site2:year2) = Model_Intercept + site2 + year2

mean(site1) = (2*Model_Intercept + year2)/2
mean(site2) = (2*(Model_Intercept + site2) + year2)/2
and so on...
(In most estimation routines the intercept is by default defined with
reference to the first level of each factor in the equation, here site1 and
year1, which are both coded 0 in this scheme; but the reference levels can
be changed manually.)
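In R you can do this arithmetic directly on the fixed-effect estimates; a
small sketch (M_add is a hypothetical additive glmer fit of site + year,
and the coefficient names "site2"/"year2" depend on your factor levels):

b <- fixef(M_add)   # M_add: hypothetical additive fit; named coefficient vector

cell_s1y1 <- b["(Intercept)"]
cell_s1y2 <- b["(Intercept)"] + b["year2"]
cell_s2y1 <- b["(Intercept)"] + b["site2"]
cell_s2y2 <- b["(Intercept)"] + b["site2"] + b["year2"]

# for a binomial model these are on the logit scale;
# plogis() back-transforms them to probabilities
plogis(c(cell_s1y1, cell_s1y2, cell_s2y1, cell_s2y2))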

If the model is:
y = site + year + site:year, then
cellmean(site1:year1) = Model_Intercept
cellmean(site1:year2) = Model_Intercept + year2
cellmean(site2:year1) = Model_Intercept + site2
cellmean(site2:year2) = Model_Intercept + site2 + year2 + site2:year2

Only the fourth equation changes, but this can nonetheless have a huge
impact on the estimates of the other parameters.

(R drops the reference levels from the coefficient names in the output, so
you can easily identify which levels the intercept and the coefficients
refer to.)
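For example, a small sketch of checking and changing the reference level in
R (assuming the data frame df10 and the factor feed from the original
question):

levels(df10$feed)      # the first level is the reference used for the intercept
contrasts(df10$feed)   # the dummy (treatment) coding actually applied

# change the reference level manually, then refit the model
df10$feed <- relevel(df10$feed, ref = "carrion")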
In case there are more than two sites, e.g. 4 of them, then:
cellmean(site1:year1) = Model_Intercept
cellmean(site2:year1) = Model_Intercept + site2
cellmean(site3:year1) = Model_Intercept + site3
cellmean(site4:year1) = Model_Intercept + site4

You might get the gist :)
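If you want to see exactly which dummy columns a formula produces (and
hence which cell each coefficient belongs to), you can inspect the
fixed-effects design matrix; a self-contained toy example:

d <- expand.grid(site = factor(c("site1", "site2")),
                 year = factor(c("year1", "year2")))

model.matrix(~ site + year, data = d)   # additive coding: intercept, site2, year2
model.matrix(~ site * year, data = d)   # one extra column, non-zero only in the site2:year2 cell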

Finally, if you actually want to test for an overall interaction (or for
overall main effects), looking at these individual coefficients is not
meaningful, as you can tell from the formulas above. So you might want to
do it differently (correctly), namely with likelihood ratio tests; in your
case, in R code:

Model1 <- glmer(bear_pres ~ feed + year + feed:year + (1 | Feeding.site),
                family = binomial, data = df10)
Model2 <- glmer(bear_pres ~ feed + year + (1 | Feeding.site),
                family = binomial, data = df10)

anova(Model1, Model2)   # likelihood ratio test for the interaction term
If the interaction of the two variables is significant (i.e. the anova()
output gives a * for the comparison between Model1 and Model2... :)))
then the interaction term explains a 'significant' amount of variance.
(If there is no *, you can treat the two models as equivalent in terms of
explained variance.) The same logic applies to other effects (e.g. the full
model vs. a model without a specific main effect).
You might also check the afex::mixed() function, which does this for you in
a sensible way (there are different ways of doing LRT tests...)
;))
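A rough sketch of that route, assuming the data and variable names from the
original question (afex::mixed() with method = "LRT" drops each effect in
turn and reports the corresponding likelihood ratio tests):

library(afex)

m_lrt <- mixed(bear_pres ~ feed * year + (1 | Feeding.site),
               data = df10, family = binomial, method = "LRT")
m_lrt   # one LRT line each for feed, year and the feed:year interaction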

Doing this first is often viewed as a prerequisite for 'digging' into the
model estimates (as discussed above) to find out what 'significant'
actually means in terms of changes in the cell means :)

Hope this helps,
Best, René



On Sun, 9 June 2019 at 12:45, <d.luedecke using uke.de> wrote:

> Dear Patricia,
>
> when you include an interaction, your assumption is that the relationship
> between an independent X1 and the dependent variable Y varies *depending on
> the values of another independent variable X2*. Indeed, for logistic
> regression models (as well as for many models with non-Gaussian families),
> the interpretation of interaction terms can be tricky. In such cases, I
> would recommend to compute (at least additionally) marginal effects, which
> give you an intuitive output of your results.
>
> You can do so e.g. with the "ggeffects" package (
> https://strengejacke.github.io/ggeffects/), and there is also an example
> for a logistic mixed effects model (
> https://strengejacke.github.io/ggeffects/articles/practical_logisticmixedmodel.html),
> which might help you.
>
> In your case, the code would be
> ggpredict(M1, c("feed", "year")) for the model with interaction. If you
> want to plot the results, simply call
> me <- ggpredict(M1, c("feed", "year"))
> plot(me)
>
> A comment on your model: I'm not sure, but if you compare subjects (or
> feeding sites) at two time points, you might want to model the
> auto-correlation of subjects / feeding site ("repeated measure") using your
> time variable as random slope:
>
> M1 <- glmer(bear_pres ~ feed * year + (1 + year | Feeding.site), family
> = binomial, data = df10)
>
> Computing marginal effects would then be the same function call:
> ggpredict(M1, c("feed", "year"))
>
>
> Best
> Daniel
>
>
> -----Original Message-----
> From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> On
> behalf of Patricia Graf
> Sent: Sunday, 9 June 2019 09:17
> To: r-sig-mixed-models using r-project.org
> Subject: [R-sig-ME] Mixed model interpretation with interaction
>
> Hello,
>
>
>
> I have a few questions concerning the interpretation of a GLMM output table
> when the model includes an interaction.
>
> We want to analyse bear presence at feeding sites (bear_pres) related to
> the year (two years: 2016, 2017) and the feed supplied at feeding sites
> (carrion, maize). So the response is binary (0 = no bear present, 1 = bear
> present within 5-min intervals over the whole day), both predictors are
> categorical, and we include feeding site ID as a random factor.
>
>
>
> The model includes some other variables too but for simplicity I just use
> those two variables for explanation.
>
>
>
> 1) As I understand it, in a model without an interaction, the interpretation of
> the results would be as follows:
>
>
>
> M1 <- glmer(bear_pres ~ feed + year + (1 | Feeding.site), family = binomial,
> data = df10)
>
> Fixed effects:
>
>            Estimate Std. Error z value Pr(>|z|)
>
> (Intercept) -4.58524    0.08529 -53.76   <2e-16 ***the intercept is bear
> presence at maize sites in 2016
>
> feedcarrion    0.39178    0.02139   18.32  <2e-16 ***bear presence at carrion
> feeding sites compared to maize feeding sites
>
> year2017    0.23027    0.01978   11.64  <2e-16 ***bear presence in 2017
> compared to 2016
>
>
>
> Is this interpretation right?
>
>
>
>
>
> 2) To my knowledge, the output changes when you include an interaction:
>
>
>
> M2 <- glmer(bear_pres ~ year * feed + (1 | Feeding.site), family = binomial,
> data=df10)
>
> Fixed effects:
>
>                   Estimate Std. Error z value Pr(>|z|)
>
> (Intercept)       -4.36413    0.10730 -40.67  < 2e-16 ***the intercept is
> bear presence at maize sites in 2016 (baseline)
>
> year2017          -0.18010    0.05119  -3.52 0.000434 ***difference in bear
> presence in 2017 compared to 2016 for maize
>
> feedcarrion          -0.02933    0.05318  -0.55 0.581222    difference in
> bear presence at carrion sites compared to maize sites in 2016
>
> year2017:feedcarrion  0.85275   0.09953    8.57  < 2e-16 ***difference
> between bear presence at carrion sites in 2017 and the sum ß0 + ß1 + ß2
>
>
>
> So, to my questions: Is this interpretation right? How is the model coded so
> that it produces this output, i.e. why does the year coefficient no longer
> compare 2016 to 2017 overall, as in the model without the interaction? And
> why doesn't the feed coefficient still compare the two food types overall?
>
>
>
> As I understand it, when you include an interaction between the two binary,
> dummy-coded categorical variables, the interpretation of what were main
> effects before (year, carrion) changes, and so do the betas (these are
> then called „simple effects“).
>
>
>
> In my group, there is a strong belief that in M2 the year coefficient still
> compares the two years (and likewise for feed); it's just that the
> coefficient cannot be interpreted directly anymore. There is also a belief
> that the interaction term is a comparison against feedmaize in the year 2016.
>
>
>
> If my interpretation is correct, I need some background on how the algorithm
> works, how the simple effects arise, and why the interaction should be
> interpreted as in the output table of M2.
>
>
>
> Thank you for your help in advance!
>
