[R-sig-ME] Most principled reporting of mixed-effect model regression coefficients

Maarten Jung Maarten.Jung sending from mailbox.tu-dresden.de
Mon Feb 17 10:35:53 CET 2020


> Thanks, Maarten. So I was planning on reporting R^2 (along with AIC) for the overall model fit, not for each predictor, since the regression coefficients themselves give a good indication of the relationship (though I wasn't aware that R^2 is "riddled with complications"). Is Henrik saying this only with regard to LMMs and GLMMs?

That makes sense to me. For the overall model fit I would probably
still go with Johnson's version [1], which I describe in my
StackExchange post (and I think you mentioned it, or the Nakagawa and
Schielzeth version it is based on, earlier), and report both the
marginal and conditional R^2 values. The regression coefficients
provide unstandardized effect sizes on the response scale, which I
think is a valid way to report effect sizes (see below).
I think Henrik refers to (G)LMMs and gives Rights & Sterba (2019) [2]
as a reference. The GLMM FAQ website also provides a good overview [3].
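
For concreteness, a minimal sketch of what I mean (not from the
original thread; the model and data are purely illustrative, using
lme4's built-in sleepstudy data):

    library(lme4)
    library(MuMIn)  # r.squaredGLMM() implements the Nakagawa & Schielzeth
                    # R^2 with Johnson's (2014) extension to random slopes

    # example model, for illustration only
    m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

    # marginal R^2 (fixed effects only) and conditional R^2
    # (fixed + random effects) for the overall model fit
    r.squaredGLMM(m)

    # AIC for the overall fit, as mentioned above
    AIC(m)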

> When you say "there is no agreed upon way to calculate effect sizes" I'm a little confused. I read through your StackExchange post, but Henrik's answer refers to standardized effect sizes. You write, further down, "Whenever possible, we report unstandardized effect sizes, which is in line with the general recommendation of how to report effect sizes"

What you cite is still Henrik's opinion (and I hoped I could make
this clear by writing "This is what he suggests [...]" and by using
the <blockquote> on StackExchange). And your citation still refers to
LMMs, as he says: "Unfortunately, due to the way that variance is
partitioned in linear mixed models (e.g., Rights & Sterba, 2019),
there does not exist an agreed upon way to calculate standard effect
sizes for individual model terms such as main effects or
interactions."
In general, I agree with him and with his recommendation to report
unstandardized effect sizes (e.g. regression coefficients) if they
have a "meaningful" interpretation.
The semi-partial R^2 I mentioned in my last e-mail is an
additional/alternative indicator of effect size that is probably more
in line with what psychologists are used to seeing reported in papers
(especially when results of factorial designs are reported) - and
that's the reason I mentioned it (a short sketch follows below).
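
In case it is useful, here is a hedged sketch of how such semi-partial
R^2 values can be computed; the r2glmm package is one possible
implementation (it is not mentioned above, and other packages exist):

    library(lme4)
    library(r2glmm)

    m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

    # partial = TRUE returns a semi-partial R^2 for each fixed-effect
    # term in addition to the R^2 for the model as a whole;
    # method = "nsj" uses the Nakagawa-Schielzeth-Johnson approach
    r2beta(m, partial = TRUE, method = "nsj")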

> I'm also working on a systematic review where there's disagreement over whether effect sizes should be standardized, but it does seem that to yield any kind of meaningful comparison, effect sizes would have to be standardized. I don't usually report standardized effect sizes...however, there are times when I z-score IVs to put them on the same scale, and I guess the output of that would be a standardized effect size. I wasn't aware of pushback on that practice. What issues would arise from this?

There is nothing wrong with standardizing predictor variables (e.g.
by dividing by 1 or 2 standard deviations) to get measures of
variable importance (within the same model).
Issues arise when standardized effect sizes such as R^2, partial
eta^2, etc. are compared across different models without thinking
about what differences in these measures can be attributed to (see
e.g. this question [4] or the Pek & Flora (2018) paper [5] that Henrik
cites). Note that these are general issues that apply to all
regression models, not only mixed models.
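
As a minimal sketch of the standardization I mean (variable names are
placeholders, again using the sleepstudy data just for illustration):

    library(lme4)

    d <- transform(sleepstudy,
                   Days_1sd = (Days - mean(Days)) / sd(Days),        # divide by 1 SD
                   Days_2sd = (Days - mean(Days)) / (2 * sd(Days)))  # 2 SD, Gelman-style

    m1 <- lmer(Reaction ~ Days_1sd + (Days_1sd | Subject), data = d)

    # the slope is now the change in Reaction per 1 SD of Days, which
    # makes predictors within the same model comparable
    fixef(m1)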

[1] https://doi.org/10.1111/2041-210X.12225
[2] https://doi.org/10.1037/met0000184
[3] https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#how-do-i-compute-a-coefficient-of-determination-r2-or-an-analogue-for-glmms
[4] https://stats.stackexchange.com/questions/13314/is-r2-useful-or-dangerous/13317
[5] https://doi.org/10.1037/met0000126

Best,
Maarten


