[R-sig-ME] For what can I use a correlation of fixed effects from (g)lmer?
steven.v.miller at gmail.com
Mon Feb 22 21:46:17 CET 2016
I have a question about how I might use the correlation of fixed effects
that comes standard with every (g)lmer call. I'll briefly explain the
situation I'm encountering.
- I use mixed effects models mostly for cross-national survey research.
I have both individual-level fixed effects and country-level fixed effects.
- My interest is mostly in the country-level fixed effects. The
individual-level stuff tends to be standard "controls" that reviewers want
to see.
- I'm not convinced the individual-level fixed effects are entirely
necessary. My hunch is they just make for inefficient estimates of the
country-level fixed effects that interest me. The individual-level
variables just create missing data problems. However, they're stuff that
reviewers insist on seeing absent any other information about what a mixed
effects model is doing.
I have a project (manuscript here:
https://www.dropbox.com/s/harb6ylpcxdpalr/etst.pdf?dl=0 | appendix here:
reviewers rejected because the country-level fixed effects were rendered
statistically insignificant (i.e. not discernible from zero) upon the
inclusion of the individual-level variables. They said that one
individual-level attribute (which by itself contributes to listwise
deletion of 30% of the data) somehow made the country-level fixed effects
"spurious" to its inclusion. This already strikes me as a bold claim for
theoretical and statistical reasons, but here's what I did to circumvent it:
- Estimate just the country-level fixed effects.
- Use multiple imputation to generate five full data sets and merge in
the macro-level information after the imputation. The results for the
country-level fixed effects were almost identical to the analyses with just
the country-level fixed effects.
- Omit the offending individual-level variables that contribute the most
missingness. These results were consistent with the results from the other
two estimation strategies.
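
For the multiple-imputation step, one common way to combine the five sets of
estimates is Rubin's rules. The sketch below is Python rather than R, and it
assumes that pooling rule, which may not be exactly what was done for the
manuscript. The coefficient values and standard errors are made up for
illustration, and `pool_rubin` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                        # pooled point estimate
    within = mean([se ** 2 for se in std_errors])   # average sampling variance
    between = variance(estimates)                   # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between          # total variance of the pooled estimate
    return pooled, sqrt(total)


# Made-up estimates of one country-level coefficient from five imputed data sets.
estimates = [0.50, 0.52, 0.48, 0.51, 0.49]
std_errors = [0.10, 0.10, 0.10, 0.10, 0.10]

pooled_estimate, pooled_se = pool_rubin(estimates, std_errors)
print(round(pooled_estimate, 3), round(pooled_se, 4))
```

When the estimates barely move across imputations, as in this made-up case, the
pooled standard error stays close to the within-imputation standard error.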
However, the reviewers just didn't buy it and torpedoed the manuscript.
Is this something that the correlation of fixed effects could be useful in
addressing? Here's the correlation of fixed effects (without the
intercepts) for the analysis in question. In this analysis, the three
variables in the bottom row (i.e. the two threat indices and the level of
democracy) are the country-level variables for this cross-national survey
analysis. The other variables are individual-level attributes. It's worth
reiterating that every variable that is not binary is scaled by two
standard deviations to create a meaningful zero.
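
As a sketch of that rescaling, assuming it means centering each non-binary
variable on its mean and dividing by two sample standard deviations (the
function name `scale_two_sd` and the sample values are hypothetical):

```python
from statistics import mean, stdev


def scale_two_sd(values):
    """Center on the mean and divide by two sample standard deviations."""
    center = mean(values)
    spread = stdev(values)
    return [(v - center) / (2 * spread) for v in values]


# Hypothetical respondent ages; binary variables would be left unscaled.
ages = [23, 35, 41, 52, 67]
scaled_ages = scale_two_sd(ages)
print([round(v, 3) for v in scaled_ages])
```

After this transformation zero corresponds to the variable's mean, and the
rescaled variable has a standard deviation of 0.5.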
Notice that the bottom-left quadrant is entirely white (i.e. the
correlation of the individual-level fixed effects with the country-level
fixed effects is basically zero). Is this telling me that the correlation
for any one individual-level fixed effect and a country-level fixed effect
is almost zero (i.e. they have almost no bearing on each other)? The most
I've seen anyone discuss this correlation matrix is here:
It is an approximate correlation of the estimator of the fixed
effects. (I include the word "approximate" because I should but in
this case the approximation is very good.) I'm not sure how to
explain it better than that. Suppose that you took an MCMC sample
from the parameters in the model, then you would expect the sample of
the fixed-effects parameters to display a correlation structure like
and here (
The "correlation of fixed effects" output doesn't have the intuitive
meaning that most would ascribe to it. Specifically, it is not about the
correlation of the variables (as OP notes). It is in fact about the
expected correlation of the regression coefficients. Although this may
speak to multicollinearity it does not necessarily.
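
To make the two quotes concrete: the printed matrix is the estimated covariance
matrix of the coefficient estimates rescaled to a correlation matrix. In R this
would be along the lines of `cov2cor(vcov(fit))`. Below is a minimal Python
sketch of that rescaling. The 3x3 covariance matrix is invented for
illustration (two individual-level coefficients followed by one country-level
coefficient), and `cov_to_cor` is a hypothetical helper name.

```python
from math import sqrt


def cov_to_cor(cov):
    """Rescale a covariance matrix of estimates to a correlation matrix."""
    k = len(cov)
    sd = [sqrt(cov[i][i]) for i in range(k)]  # standard errors of the estimates
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]


# Invented covariance matrix of three coefficient estimates:
# rows/columns 0 and 1 are individual-level, row/column 2 is country-level.
vcov = [
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.00],
    [0.00, 0.00, 0.25],
]

cor = cov_to_cor(vcov)
for row in cor:
    print([round(value, 3) for value in row])
```

In this invented example the country-level estimate has zero correlation with
both individual-level estimates, which is the pattern of a "white" quadrant.
That is a statement about the sampling covariance of the estimates, not about
the correlation of the underlying variables.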
I should add that I've estimated hundreds of mixed effects models with
individual-level and country-level variables and they all have fixed
effects correlation matrices that resemble these. I have a strong hunch
that individual-level variables don't meaningfully influence the parameter
estimates for country-level variables beyond inefficiency introduced by
missing data. In research projects where individual-level attributes don't
concern the project, I'd like to ignore them for that reason. They just
create estimation problems and slow down computation.
I might be mistaken, which is why I ask here. I thank you for your time.