[R-sig-ME] For what can I use a correlation of fixed effects from (g)lmer?

Wed Feb 24 23:08:53 CET 2016

Hi Malcolm,

Thanks for the response. I actually cite your 2014 PSRM piece in defense of
that argument. I know we're not using the same language in a similar
approach, but I remember you arguing for the exclusion of individual-level
covariates because it would just contribute to missingness and not help
with your overall research question. We're both using WVS data too.
Sometimes, missingness is not random (e.g. WVS not asking about respondent
ideology in several important countries [like China] in one of my projects).

I did want to clarify that my mean-centering approach is inspired by Gelman
(2008), who argues for scaling by two standard deviations. When I have two or
three waves, the inclusion of one or more predictors may drop out an entire
wave (e.g. EVS not asking about respondent's education levels until the
third wave). So, I try to scale within the country-wave (for individual-level
variables like age) or within the survey wave (for macro-level attributes like
a country's level of democracy). For example, here's what I do with a
respondent's age in EVS:

# ddply() comes from plyr; rescale age (x003) within each country-wave
EVS <- plyr::ddply(EVS, c("ccode","wave"), transform, zg.age = arm::rescale(x003))

and here's what I do with the level of democracy (UDS data):

# rescale the democracy score within each survey wave
Macro.EVS <- plyr::ddply(Macro.EVS, c("wave"), transform,
                         zg.udsmean = arm::rescale(udsmean))
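
For anyone who wants to see what the group-wise rescaling does without plyr or arm installed, here is a minimal base-R sketch. It assumes the default behaviour of arm::rescale for a non-binary numeric variable (subtract the mean, divide by two standard deviations). The data frame `toy` and the helper `rescale2sd` are hypothetical stand-ins, not part of my actual scripts; only the column names follow the EVS example.

```r
# Toy data: 3 countries x 2 waves x 5 respondents (all values made up).
set.seed(1)
toy <- expand.grid(id = 1:5, wave = 1:2, ccode = c(200, 220, 255))
toy$x003 <- round(runif(nrow(toy), 18, 80))  # stand-in for respondent age

# Hypothetical helper: center on the mean, divide by two standard deviations.
rescale2sd <- function(x) {
  (x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE))
}

# Apply the helper separately within each country-wave.
toy$zg.age <- ave(toy$x003, toy$ccode, toy$wave, FUN = rescale2sd)

# Every country-wave should now have mean 0 and standard deviation 0.5.
aggregate(zg.age ~ ccode + wave, data = toy,
          FUN = function(v) round(c(mean = mean(v), sd = sd(v)), 3))
```

Because the centering happens inside each country-wave, the rescaled variable carries no between-group variation in its mean, which is the same thing as group-mean-centering.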

In the example I cite above, only the bottom three rows in the correlation
matrix are country-level (really: country-year-level) covariates.
Everything else is an individual-level variable.

And yeah, I've never used a correlation of fixed effects before for
anything concerning the model I estimate. I'm curious if I actually could
use that as justification to stop flooding models with individual-level
variables that (I think) don't meaningfully influence the country-level
variables of interest to me (beyond introducing non-random missingness).
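
To make concrete what that matrix is, here is a rough sketch on simulated data. It uses plain lm() as a stand-in for a mixed model so that it runs in base R, and every variable name and number in it is invented for illustration. The "correlation of fixed effects" is the coefficient covariance matrix converted to correlations, and the sketch compares an individual-level predictor entered raw with the same predictor centered within country.

```r
set.seed(42)
n_country <- 30
n_resp    <- 50
country   <- rep(seq_len(n_country), each = n_resp)

z <- rnorm(n_country)                              # country-level predictor
x <- rnorm(n_country * n_resp) + 0.8 * z[country]  # country mean of x tracks z
y <- 0.5 * x + 0.3 * z[country] +
  rnorm(n_country)[country] +                      # country-level noise
  rnorm(n_country * n_resp)                        # respondent-level noise

d <- data.frame(y = y, x = x, z = z[country], country = country)
d$x_c <- d$x - ave(d$x, d$country)                 # subtract the country mean of x

fit_raw <- lm(y ~ x + z, data = d)    # x entered as-is
fit_cen <- lm(y ~ x_c + z, data = d)  # x centered within country

# Correlation of the coefficient estimators, not of the variables themselves.
cor_raw <- cov2cor(vcov(fit_raw))
cor_cen <- cov2cor(vcov(fit_cen))

cor_raw["x", "z"]    # clearly non-zero: the country mean of x overlaps with z
cor_cen["x_c", "z"]  # zero up to rounding: x_c is orthogonal to anything constant within country
```

For an lme4 fit the same matrix should be available as `cov2cor(as.matrix(vcov(fit)))`, and `print(summary(fit), correlation = FALSE)` hides it. The zero in the centered case is exact for OLS; how closely a given mixed model follows it depends on the random-effects structure, so this is an illustration rather than a proof about my models.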

- Steve

On Wed, Feb 24, 2016 at 4:04 PM, Malcolm Fairbrother <
M.Fairbrother at bristol.ac.uk> wrote:

> Dear Steve,
> There's a lot in your question. A couple thoughts:
> (1) I'm not clear whether you *country-mean-centered* your
> individual-level covariates. If not, the country means of those variables
> could (i.e., almost certainly will) correlate to some degree with your
> country-level variables. This will almost certainly just confuse matters,
> such that it would be best to do the mean-centering. (If you want to
> include, say, national mean education as a covariate, in addition to
> de-meaned individual level education, you can do so... But that will
> probably correlate a lot with, say, GDP/capita.) From what you say,
> mean-centering will get you what you want, and it actually might also help
> you deal with the unhelpful reviewer comments you're getting. (I totally
> agree with your reactions to those. Given what appears to be the paucity of
> logic behind their comments, surreptitiously not doing what they're saying
> but appearing to do what they're saying seems a reasonable strategy.
> Implicitly including country means increasing your degrees of freedom at
> the country level, causing a reduction in efficiency, as you suggest...
> Though it's an issue of collinearity, not just missingness.)
> So I think you're wrong that "individual-level variables don't
> meaningfully influence the parameter estimates for country-level variables
> beyond inefficiency introduced by missing data." But I think you can
> nonetheless ignore them--because only the country mean components are
> having the impacts you describe, and you seem to have substantive reasons
> to remove those components.
> (2) Like you, I've never found the "correlation of fixed effects" output
> very useful. I generally just suppress/ignore it.
> Hope that helps.
> - Malcolm
>
>
> Dr Malcolm Fairbrother
> Senior Lecturer in Global Policy and Politics
> School of Geographical Sciences
> University of Bristol
>
>
>
>
>> Date: Mon, 22 Feb 2016 15:46:17 -0500
>> From: svm <steven.v.miller at gmail.com>
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] For what can I use a correlation of fixed effects from (g)lmer?
>>
>> Hi all,
>>
>> I have a question about how I could use the correlation of fixed effects
>> that comes standard with every (g)lmer call. I'll briefly explain the
>> situation I'm encountering.
>>
>>
>>    - I use mixed effects models mostly for cross-national survey research.
>>    I have both individual-level fixed effects and country-level fixed
>>    effects.
>>    - My interest is mostly the country-level fixed effects. The
>>    individual-level stuff tends to be standard "controls" that reviewers
>>    want to see.
>>    - I'm not convinced the individual-level fixed effects are entirely
>>    necessary. My hunch is they just make for inefficient estimates of the
>>    country-level fixed effects that interest me. The individual-level
>>    variables just create missing data problems. However, they're stuff
>>    that reviewers insist on seeing absent any other information about what
>>    a mixed effects model is doing.
>>
>>
>> I have a project (manuscript here:
>> https://www.dropbox.com/s/harb6ylpcxdpalr/etst.pdf?dl=0 | appendix here:
>> https://www.dropbox.com/s/pq8gmr7v1xvvu2h/etst-appendix.pdf?dl=0) that
>> reviewers rejected because the country-level fixed effects were rendered
>> statistically insignificant (i.e. not discernible from zero) upon the
>> inclusion of the individual-level variables. They said that one
>> individual-level attribute (which by itself contributes to listwise
>> deletion of 30% of the data) somehow made the country-level fixed effects
>> "spurious" to its inclusion. This already strikes me as a bold claim for
>> theoretical and statistical reasons, but here's what I did to circumvent
>> this claim:
>>
>>
>>    - Estimate just the country-level fixed effects.
>>    - Use multiple imputation to generate five full data sets and merge in
>>    the macro-level information after the imputation. The results for the
>>    country-level fixed effects were almost identical to the analyses with
>>    just the country-level fixed effects.
>>    - Omit the offending individual-level variables that contribute the
>>    most missingness. These results were consistent with the results from
>>    the other two estimation strategies.
>>
>>
>> However, the reviewers just didn't buy it and torpedoed the manuscript.
>>
>> Is this something that the correlation of fixed effects could be useful in
>> addressing? Here's the correlation of fixed effects (without the
>> intercepts) for the analysis in question. In this analysis, the three
>> variables in the bottom rows (i.e. the two threat indices and the level of
>> democracy) are the country-level variables for this cross-national survey
>> analysis. The other variables are individual-level attributes. It's worth
>> reiterating that every variable that is not binary is scaled by two
>> standard deviations to create a meaningful zero.
>>
>> http://i.imgur.com/eIiZH9b.png
>>
>> Notice that the bottom-left quadrant is entirely white (i.e. the
>> correlation of the individual-level fixed effects with the country-level
>> fixed effects is basically zero). Is this telling me that the correlation
>> for any one individual-level fixed effect and a country-level fixed effect
>> is almost zero (i.e. they have almost no bearing on each other)? The most
>> I've seen anyone discuss this correlation matrix is here:
>>
>> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q1/001941.html
>>
>> It is an approximate correlation of the estimator of the fixed
>> effects.  (I include the word "approximate" because I should but in
>> this case the approximation is very good.)  I'm not sure how to
>> explain it better than that.  Suppose that you took an MCMC sample
>> from the parameters in the model, then you would expect the sample of
>> the fixed-effects parameters to display a correlation structure like
>> this matrix.
>>
>>
>> and here
>> (http://stats.stackexchange.com/questions/57240/how-do-i-interpret-the-correlations-of-fixed-effects-in-my-glmer-output):
>>
>>
>> The "correlation of fixed effects" output doesn't have the intuitive
>> meaning that most would ascribe to it. Specifically, is not about the
>> correlation of the variables (as OP notes). It is in fact about the
>> expected correlation of the regression coefficients. Although this may
>> speak to multicollinearity it does not necessarily.
>>
>>
>> I should add that I've estimated hundreds of mixed effects models with
>> individual-level and country-level variables and they all have fixed
>> effects correlation matrices that resemble these. I have a strong hunch
>> that individual-level variables don't meaningfully influence the parameter
>> estimates for country-level variables beyond inefficiency introduced by
>> missing data. In research projects where individual-level attributes don't
>> concern the project, I'd like to ignore them for that reason. They just
>> create estimation problems and slow down computation.
>>
>> I might be mistaken, which is why I ask here. I thank you for your time.
>>
>>
>>
