[R-sig-ME] Cluster-robust SEs & random effects -- seeking some clarification

Wed Jul 27 18:04:51 CEST 2022

Hi J.D.,

Responses inline below.

James

On Tue, Jul 26, 2022 at 10:42 PM J.D. Haltigan <jhaltiga using gmail.com> wrote:

> Many thanks for this detailed and insightful exposition, James.
>
> A few follow-ups:
>
> I had previously tried cluster-robust SEs with both the robustlmm package
> and now yours, and it appears I don't have the memory needed given the size
> of the data as I receive the following error:
> #Error in .local(x, y, ...) :
> #Cholmod error 'problem too large' at file ../Core/cholmod_sparse.c, line 89
> In Googling this error message, I see it is likely due to the computational
> demands of a sparse matrix estimation, but was wondering if there were any
> other aspects of this I could explore.
>
>
To figure out what's going on, we need to know the dimensions of your
model. How many coefficients (fixed effects) are in your regression
specification? How many clusters do you have? What is the range of cluster
sizes?
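
As a rough sketch of how to pull those numbers out of a fitted model,
assuming the model was fit with lme4: the object name `fit` and the
built-in sleepstudy data are stand-ins for illustration only, not your
actual model or data.

```r
library(lme4)

# Stand-in example; replace with your own formula and data.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Number of regression coefficients (fixed effects)
n_coef <- length(fixef(fit))

# Number of levels of each grouping factor (i.e., number of clusters)
n_clusters <- ngrps(fit)

# Smallest and largest cluster size, based on the rows actually used in the fit
cluster_sizes <- table(getME(fit, "flist")$Subject)
size_range <- range(cluster_sizes)

n_coef
n_clusters
size_range
```

With your own model, swap `Subject` for the name of your cluster
(village) grouping factor.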

> In regards to #4: I am invoking random effects to see how sensitive a fixed
> effects model (with cluster robust SEs) is to formally estimating the
> random cluster effect (so between-cluster variance). In the fixed effects
> model, the investigators did include a factor variable (i.e., cluster
> dummies as you describe below) that is nested within cluster (so, a pair
> variable indicating treatment-control village), but my predilection is that
> despite this, there are other sources of between-cluster variance that will
> likely nullify the point estimates of the fixed effects (in this case, a
> mask intervention).

I'm not sure what you mean by "other sources of between-cluster variance
that will likely nullify the point estimates of the fixed effects." Can you
explain in a bit more detail?

> So, if I am formally modeling the random cluster
> component, what does adding cluster-robust SEs in this case provide in
> terms of inference--both for the fixed effects and for the random effects?
>

Cluster-robust SEs are only relevant for inference on the regression
coefficients (fixed effects)--not for the random effects. For the fixed
effects, using cluster-robust SEs provides robustness to misspecification
of the random effects model, such as omission of a random slope that should
be there. In the context of your example, there might be heterogeneity of
intervention effects across pairs of villages. A random effects model that
only includes random intercepts for each village will not "catch" this sort
of treatment effect heterogeneity, and model-based SEs could be invalidated
by it. Cluster-robust SEs will account for that sort of heterogeneity (and
will typically be larger than the model-based SEs as a consequence).
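
To make the comparison concrete, here is a minimal sketch that puts
model-based and cluster-robust SEs side by side for a random-intercept
model, using lme4 and clubSandwich. The sleepstudy data and the object
names are assumptions for illustration. The random slope for Days is
deliberately left out, to mimic the kind of misspecification described
above.

```r
library(lme4)
library(clubSandwich)

# Random-intercept-only model; a random slope for Days is intentionally omitted.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Model-based SEs, which rely on the random effects structure being correct
se_model <- sqrt(diag(as.matrix(vcov(fit))))

# Cluster-robust (CR2) variance estimate; for lmer fits the clustering
# variable is taken from the model's grouping factor by default
V_cr2 <- vcovCR(fit, type = "CR2")
se_robust <- sqrt(diag(as.matrix(V_cr2)))

cbind(model_based = se_model, cluster_robust = se_robust)

# Tests for the fixed effects using the robust SEs and
# Satterthwaite degrees of freedom
coef_test(fit, vcov = "CR2", test = "Satterthwaite")
```

If the slopes really do vary across clusters, I would expect the robust SE
for the Days coefficient to come out noticeably larger than the model-based
one, but that is something to check against the output rather than assume.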
