[R-sig-ME] Cluster-robust SEs & random effects -- seeking some clarification

J.D. Haltigan jhaltiga using gmail.com
Sat Jul 30 23:25:12 CEST 2022

This is a very helpful walkthrough, James. My responses are italicized
under yours to maintain thread readability. The key here is
generalizability: as I also note in my last reply below, the idea is to
generalize to a universe of "any villages or clusters." That is, the target
population we are generalizing to is *any* randomly selected set of villages.

On Sat, Jul 30, 2022 at 3:01 PM James Pustejovsky <jepusto using gmail.com> wrote:

> Hi J.D.,
> A few comments/reactions inline below.
> James
> On Wed, Jul 27, 2022 at 5:37 PM J.D. Haltigan <jhaltiga using gmail.com> wrote:
>> ...
>> In the original investigation, the authors did not invoke a random effects
>> model (but did use the pairIDs to control for fixed effects as noted and
>> with robust SEs). Thus, in the original investigation there was *no*
>> specification of a random effects model for the 'cluster' variable. We know
>> from some other work there were some biases in village mapping and other
>> possible sources of between-cluster variation that might be anticipated to
>> have influence--at the random intercepts level--so we are looking into how
>> specifying 'cluster' as a random effect might change the fixed effects
>> estimates for the treatment intervention effect. In the Hamaker et al.
>> language, it is indeed a 'random intercepts' only model.
> I don't follow how using a random intercepts model improves the
> generalizability warrant here. The random intercepts model is essentially
> just a re-weighted average of the pair-specific effects in the original
> analysis, where the weights are optimally efficient if the model is
> correctly specified. That last clause carries a lot of weight here--correct
> specification means 1) treatment assignment is unrelated to the random
> effects, 2) the treatment effect is constant across clusters, 3)
> distributional assumptions are valid (i.e., homoskedasticity at each level
> of the model).
> If the effects are heterogeneous, then I would think that including random
> slopes on the treatment indicator would provide a better basis for
> generalization. But even then, the warrant is still pretty vague---what is
> the hypothetical population of villages from which the observed villages
> are sampled?

*In the most basic model (without baseline controls) the model takes the
form:

    myModel = lmer(posXsymp ~ treatment + pairID + (1 | union),
                   data = myData)

I believe--correct me if I am wrong--that this reflects a random-intercepts
only model. If I am mistaken, and this is allowing for random slopes on the
treatment indicator, then I will need to rethink my statements.*
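
For reference, lme4's formula syntax makes the distinction explicit:
(1 | union) requests a random intercept only, whereas random slopes on
treatment would be written (1 + treatment | union). A minimal sketch on
simulated data follows; the simulated myData, its sample sizes, and the
effect sizes are assumptions for illustration and are not the study data.

```r
library(lme4)

set.seed(1)

# Simulated stand-in for myData (assumption): 20 pairs, 2 unions per pair,
# one treated union per pair, 30 individuals per union.
n_pairs <- 20
clusters <- data.frame(
  union     = factor(seq_len(2 * n_pairs)),
  pairID    = factor(rep(seq_len(n_pairs), each = 2)),
  treatment = rep(c(0, 1), times = n_pairs)
)
myData <- clusters[rep(seq_len(nrow(clusters)), each = 30), ]

# Continuous outcome with a union-level random intercept (illustrative values).
u <- rnorm(2 * n_pairs, sd = 0.5)
myData$posXsymp <- 0.3 * myData$treatment +
  u[as.integer(myData$union)] + rnorm(nrow(myData))

# Same structure as the model in this thread: random intercept for union only.
myModel <- lmer(posXsymp ~ treatment + pairID + (1 | union), data = myData)

# The only random-effect column per union is the intercept.
re_terms <- names(ranef(myModel)$union)
print(re_terms)
print(VarCorr(myModel))

# A random slope on treatment would instead be specified as
#   (1 + treatment | union)
# and would add a "treatment" column to ranef(myModel)$union.
```

Whether a random slope on treatment is estimable at all depends on whether
treatment varies within union, which is not stated in this thread.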

>> Given this, however, does it also make sense to include the cluster
>> robust SEs for the fixed effects which would account for possible
>> heterogeneity of treatment effects (i.e., slopes) across clusters?
> If you're committed to the random intercepts model, then yes I think so
> because using cluster robust SEs at least acknowledges the possibility of
> heterogeneous treatment effects.

*If the above model does allow for both random intercepts and slopes, then
perhaps the use of cluster robust SEs is redundant in some sense since the
random slopes would be modeling the heterogeneity in treatment effects?*
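
For concreteness, one way to obtain cluster-robust SEs for the fixed effects
of a random-intercepts lmer fit is the clubSandwich package, whose
coef_test() function accepts lmer models. The sketch below uses simulated
data (an assumption, not the study data); the "CR2" adjustment and clustering
on union are likewise assumptions rather than choices specified in this
thread.

```r
library(lme4)
library(clubSandwich)

set.seed(2)

# Simulated stand-in for myData (assumption): 20 pairs, 2 unions per pair,
# one treated union per pair, 30 individuals per union.
n_pairs <- 20
clusters <- data.frame(
  union     = factor(seq_len(2 * n_pairs)),
  pairID    = factor(rep(seq_len(n_pairs), each = 2)),
  treatment = rep(c(0, 1), times = n_pairs)
)
myData <- clusters[rep(seq_len(nrow(clusters)), each = 30), ]
u <- rnorm(2 * n_pairs, sd = 0.5)
myData$posXsymp <- 0.3 * myData$treatment +
  u[as.integer(myData$union)] + rnorm(nrow(myData))

# Random-intercepts model with pair fixed effects, as in this thread.
myModel <- lmer(posXsymp ~ treatment + pairID + (1 | union), data = myData)

# Model-based SE for the treatment coefficient.
model_se <- coef(summary(myModel))["treatment", "Std. Error"]

# Cluster-robust (CR2) SE for the same coefficient, clustering on union.
robust <- coef_test(myModel, vcov = "CR2", cluster = myData$union,
                    coefs = "treatment")

print(c(model_based = model_se, cluster_robust = robust$SE))
```

Comparing the two SEs side by side is one rough check on how much the
model-based SE leans on the random-intercepts specification being correct.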

>> Bottom line: in their original analyses, clusters are seen as
>> interchangeable from a conceptual perspective (rather than drawn from a
>> random universe of observations). When one scales up evidence to a universe
>> of observations that are random (as they would be in the intended universe
>> of inference in the real-world), then we are better positioned, I think, to
>> adjudicate whether the mask intervention effect is 'practically
>> significant' (in addition to whether the focal effect remains marginally
>> significant from a frequentist perspective).
> As noted above, this argument is a bit vague to me. If there's concern
> about generalizability, then my first question would be: what is the target
> population to which you are trying to generalize?

*Essentially, the target population we are trying to generalize to is a
random selection of villages. Any random selection of villages. In other
words, villages should not be seen as interchangeable. We are interested in
whether the effects generalize to any randomly selected village.*

