[R-sig-ME] Cluster-robust SEs & random effects -- seeking some clarification

J.D. Haltigan jh@|t|g@ @end|ng |rom gm@||@com
Sun Jul 31 02:12:41 CEST 2022


Thanks, Ben. So in the model you remarked on, would that be a
'random-intercepts only' model?


On Sat, Jul 30, 2022 at 7:53 PM Ben Bolker <bbolker using gmail.com> wrote:

> I haven't been following the whole thread that carefully, but I want to
> emphasize that
>
>    posXsymp~treatment + pairID + (1 | union)
>
> is *not*, by any definition I'm familiar with, a "random-slopes model";
> that is, it only estimates a single population-level treatment
> effect/doesn't allow the effect of treatment to vary across groups
> defined by 'union'.  You would need a random-effect term of the form
> (treatment | union).
>
>    Reasons why you might *not* want to do this:
>
>   * if treatment only varies across and not within levels of union
> ("union is nested within treatment" according to some terminology), then
> this variation is unidentifiable
>   * maybe you have decided that you don't have enough data/want a more
> parsimonious model.
>
>    Schielzeth and Forstmeier, among many others (this is the example I
> know of), have cautioned about the consequences of leaving out
> random-slopes terms.
>
> Schielzeth, Holger, and Wolfgang Forstmeier. “Conclusions beyond
> Support: Overconfident Estimates in Mixed Models.” Behavioral Ecology
> 20, no. 2 (March 1, 2009): 416–20. https://doi.org/10.1093/beheco/arn145.
>
>
> On 2022-07-30 7:43 p.m., J.D. Haltigan wrote:
> > Addendum:
> >
> > It just occurred to me on my walk that I think I am getting a bit lost in
> > some of the differences in nomenclature across scientific silos. In the
> > original model that they specified, which treated the 'pairID' variable
> as
> > a control variable for which they controlled for 'fixed effects' of
> > control/treatment villages (in their own language in the paper) using
> > cluster-robust SEs, I think this is indeed a 'random-intercepts only'
> model
> > in the language of Hamaker et al. They implement the 'absorb' command in
> > STATA which I believe aggregates across the pairIDs to generate an
> > 'omnibus' F-test of sorts for the pairID variable (in the ANOVA
> > nomenclature). I say this as when I specify the pairID variable in the
> lmer
> > model I shared (or in a fixest model I conducted to replicate the
> original
> > Abalauck results in R), I get the estimates for all the pairs (i.e.,
> there
> > is no way to aggregate across them--though I think formally the models
> are
> > the same if we are unconcerned about any one pairID [treatment/control
> > village pair].
> >
> > So, in the lmer model I shared where I specify a specific random effects
> > term for the 'cluster' variable, I think this indeed is allowing for
> random
> > slopes across the clusters which implies the treatment effect may vary
> > across the clusters (and we might anticipate it will for various reasons
> I
> > can elaborate on). More generally: we are generalizing to *any* universe
> of
> > villages (say in the entire world) where the treatment intervention
> (masks)
> > may vary across villages. This is the crux of invoking the random effects
> > model (i.e., random slopes model).
> >
> > I realize this is a mouthful, but I think the way these terms (e.g.,
> > random/fixed effects models etc.) are used across disciplines makes
> things
> > a bit confusing.
> >
> > On Sat, Jul 30, 2022 at 5:25 PM J.D. Haltigan <jhaltiga using gmail.com>
> wrote:
> >
> >> This is a very helpful walkthrough, James. My responses are italicized
> >> under yours to maintain thread readability. The key is Generalizability
> >> here and (as I also note in my last reply) the idea is to Generalize to
> a
> >> universe of "any villages or clusters." That is, the target population
> we
> >> are generalizing to is *any* random population.
> >>
> >> On Sat, Jul 30, 2022 at 3:01 PM James Pustejovsky <jepusto using gmail.com>
> >> wrote:
> >>
> >>> Hi J.D.,
> >>> A few comments/reactions inline below.
> >>> James
> >>>
> >>> On Wed, Jul 27, 2022 at 5:37 PM J.D. Haltigan <jhaltiga using gmail.com>
> wrote:
> >>>
> >>>> ...
> >>>>
> >>> In the original investigation, the authors did not invoke a random
> >>>> effects model (but did use the pairIDs to control for fixed effects as
> >>>> noted and with robust SEs). Thus, in the original investigation there
> was
> >>>> *no* specification of a random effects model for the 'cluster'
> variable. We
> >>>> know from some other work there were some biases in village mapping
> and
> >>>> other possible sources of between-cluster variation that might be
> >>>> anticipated to have influence--at the random intercepts level--so we
> are
> >>>> looking into how specifying 'cluster' as a random effect might change
> the
> >>>> fixed effects estimates for the treatment intervention effect. In the
> >>>> Hamaker et al. language, it is indeed a 'random intercepts' only
> model.
> >>>>
> >>>
> >>> I don't follow how using a random intercepts model improves the
> >>> generalizability warrant here. The random intercepts model is
> essentially
> >>> just a re-weighted average of the pair-specific effects in the original
> >>> analysis, where the weights are optimally efficient if the model is
> >>> correctly specified. That last clause carries a lot of weight
> here--correct
> >>> specification means 1) treatment assignment is unrelated to the random
> >>> effects, 2) the treatment effect is constant across clusters, 3)
> >>> distributional assumptions are valid (i.e., homoskedasticity at each
> level
> >>> of the model).
> >>>
> >>> If the effects are heterogeneous, then I would think that including
> >>> random slopes on the treatment indicator would provide a better basis
> for
> >>> generalization. But even then, the warrant is still pretty
> vague---what is
> >>> the hypothetical population of villages from which the observed
> villages
> >>> are sampled?
> >>>
> >>
> >> *In the most basic model (without baseline controls) the model takes the
> >> form: myModel = lmer(posXsymp~treatment + pairID + (1 | union), data =
> >> myData). I believe--correct me if I am wrong--that this reflects a
> >> random-intercepts only model, but I may be mistaken. If I am, and this
> is
> >> allowing for random slopes on the treatment indicator, then I will need
> to
> >> rethink my statements.  *
> >>
> >>>
> >>>
> >>>> Given this, however, does it also make sense to include the cluster
> >>>> robust SEs for the fixed effects which would account for possible
> >>>> heterogeneity of treatment effects (i.e., slopes) across clusters?s
> >>>>
> >>>> If you're committed to the random intercepts model, then yes I think
> so
> >>> because using cluster robust SEs at least acknowledges the possibility
> of
> >>> heterogeneous treatment effects.
> >>>
> >>
> >> *If the above model does allow for both random intercepts and slopes,
> then
> >> perhaps the use of cluster robust SEs is redundant in some sense since
> the
> >> random slopes would be modeling the heterogeneity in treatment effects?*
> >>
> >>>
> >>>
> >>>
> >>>> Bottom line: in their original analyses, clusters are seen as
> >>>> interchangeable from a conceptual perspective (rather than drawn from
> a
> >>>> random universe of observations). When one scales up evidence to a
> universe
> >>>> of observations that are random (as they would be in the intended
> universe
> >>>> of inference in the real-world), then we are better positioned, I
> think, to
> >>>> adjudicate whether the mask intervention effect is 'practically
> >>>> significant' (in addition to whether the focal effect remains
> marginally
> >>>> significant from a frequentist perspective).
> >>>>
> >>> As noted above, this argument is a bit vague to me. If there's concern
> >>> about generalizability, then my first question would be: what is the
> target
> >>> population to which you are trying to generalize?
> >>>
> >>
> >> *Essentially, the target population we are trying to generalize to is a
> >> random selection of villages. Any random selection of villages. In other
> >> words, villages should not be seen as interchangeable. We are
> interested in
> >> whether the effects generalize to any randomly selected village. *
> >>
> >>>
> >>>
> >>
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> --
> Dr. Benjamin Bolker
> Professor, Mathematics & Statistics and Biology, McMaster University
> Director, School of Computational Science and Engineering
> (Acting) Graduate chair, Mathematics & Statistics
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list